Schedules Pro
A Schedule represents a programmed report that runs at a given periodicity, or after a deploy webhook is received.
Think of a Scheduled Report as a regular Report that can be set to run automatically on a defined schedule or via deploy hooks.
Attributes
- ID
- Unique schedule ID.
- Starting URL
- The URL that the Spider uses for its initial request. The Spider will include the internal links from that starting URL, and then (if Deep Crawl is enabled) recursively include the linked web pages from those, until the Max Pages limit is reached.
- Initial URLs
- A list of URLs to be included on the first run of the Spider. Newline-separated.
- Exclusions
- A list of URLs or partial paths. The Spider skips any URL that matches one of them. Newline-separated.
- Perform A11Y Checks
- Boolean to indicate whether or not accessibility checks will be included.
- Perform HTML Checks
- Boolean to indicate whether or not HTML checks will be included.
- Deep Crawl
- Boolean to indicate whether deep crawling is enabled or not. If it's enabled, the Spider will recursively include more linked pages from the pages it finds, until the Max Pages limit is reached.
- Dynamic Crawler
- Boolean to indicate whether Dynamic Crawler should be used instead of the default static crawler. The Dynamic Crawler renders each web page found using a headless browser, so it's able to find links in JavaScript-powered web pages.
- Device Rotated
- Boolean to indicate if the emulated device viewport is rotated.
- Max Pages
- Maximum number of web pages to include. Places a limit on the Spider.
- Periodicity
- Shows when the schedule will be run, using the key every, which can be one of day / week / month.
  - If week is used, then weekday will indicate the week day the schedule is run.
  - If month is used, the field monthday will give the day of the month the schedule will be run on.
  - If this field is null, the schedule will never be run, except through deploy hooks.
- Rate Limit
- Maximum allowed requests per second.
- Active
- Boolean to enable or disable the schedule. Only active ones will be run.
- Tags
- Comma-separated list of tags to categorize this schedule.
- Inserted At
- Timestamp when the schedule was created.
- Updated At
- Timestamp when the schedule was last updated.
- Last Run At
- Timestamp when the schedule was last run.
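As a reading aid, the Periodicity rules listed above can be sketched in Python. The function name describe_periodicity is hypothetical and not part of the API; the snippet assumes only the keys described in the attribute list (every, weekday, monthday) and a null periodicity decoded as None.

```python
from typing import Optional


def describe_periodicity(periodicity: Optional[dict]) -> str:
    """Summarise a schedule's periodicity map in plain words (illustrative helper)."""
    if periodicity is None:
        # A null periodicity means the schedule only runs through deploy hooks.
        return "deploy hooks only"
    every = periodicity.get("every")
    if every == "day":
        return "every day"
    if every == "week":
        return f"every week on {periodicity.get('weekday')}"
    if every == "month":
        return f"every month on day {periodicity.get('monthday')}"
    raise ValueError(f"unexpected 'every' value: {every!r}")


print(describe_periodicity({"every": "month", "monthday": 15}))
print(describe_periodicity(None))
```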
Relationships
- Reports
- The list of reports created via this schedule.
- Device
- The emulated device viewport used in the accessibility checks.
Example
Example: Schedule example
{
  "data": {
    "attributes": {
      "active": true,
      "deep_crawl": true,
      "device_rotated": false,
      "dynamic_crawler": false,
      "exclusions": [
        "/news",
        "/tour"
      ],
      "initial_urls": [
        "https://github.blog/category/engineering/",
        "https://github.blog/category/open-source/"
      ],
      "inserted_at": "2022-06-18T10:09:10",
      "last_run_at": "2022-08-15T08:51:25",
      "max_pages": 100,
      "perform_a11y_checks": true,
      "perform_html_checks": true,
      "periodicity": {
        "every": "month",
        "monthday": 15
      },
      "rate_limit": 3,
      "starting_url": "https://github.blog/",
      "tags": [
        "personal",
        "scheduled"
      ],
      "updated_at": "2024-05-15T08:35:02"
    },
    "id": "2d8cc37a-1467-493b-8660-f97e33ca2c0a",
    "relationships": {
      "device": {
        "links": {
          "related": "http://rocketapi.dev:4000/api/v1/devices/default"
        }
      },
      "reports": {
        "links": {
          "related": "http://rocketapi.dev:4000/api/v1/reports?filter[schedule_id]=2d8cc37a-1467-493b-8660-f97e33ca2c0a"
        }
      }
    },
    "type": "schedule"
  },
  "jsonapi": {
    "version": "1.0"
  }
}
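A document shaped like the example above could be read in Python roughly as follows. This is a minimal sketch: summarize_schedule is a hypothetical helper, and the sample is a trimmed copy of the example that keeps only the fields the helper reads.

```python
import json

# Trimmed copy of the example Schedule document; only the fields read below are kept.
SAMPLE = """
{
  "data": {
    "attributes": {
      "active": true,
      "periodicity": {"every": "month", "monthday": 15},
      "starting_url": "https://github.blog/"
    },
    "id": "2d8cc37a-1467-493b-8660-f97e33ca2c0a",
    "relationships": {
      "device": {"links": {"related": "http://rocketapi.dev:4000/api/v1/devices/default"}},
      "reports": {"links": {"related": "http://rocketapi.dev:4000/api/v1/reports?filter[schedule_id]=2d8cc37a-1467-493b-8660-f97e33ca2c0a"}}
    },
    "type": "schedule"
  }
}
"""


def summarize_schedule(document: str) -> dict:
    """Pick the ID, a few attributes and the relationship links out of a Schedule document."""
    data = json.loads(document)["data"]
    attributes = data["attributes"]
    relationships = data["relationships"]
    return {
        "id": data["id"],
        "starting_url": attributes["starting_url"],
        "active": attributes["active"],
        "periodicity": attributes["periodicity"],
        "device_link": relationships["device"]["links"]["related"],
        "reports_link": relationships["reports"]["links"]["related"],
    }


summary = summarize_schedule(SAMPLE)
print(summary["id"], summary["periodicity"])
```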
Create a Schedule
To create a Schedule, send a POST request to /api/v1/schedules, with a JSON payload in the body including its attributes:
- starting_url. The initial URL the Spider will start on.
- periodicity. Map with the options for the periodicity. Requires an every key, which can be deploy, month, week or day.
  - If month is used, an additional key monthday is optional, which has to be an integer from 1 to 28 and defaults to 1.
  - If instead week is used, then an additional weekday key is optional, as a string from monday, tuesday, wednesday, thursday, friday, saturday or sunday that defaults to monday.
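The periodicity rules for creating a Schedule could be checked on the client side before sending the request. The sketch below is only illustrative: build_periodicity is a hypothetical helper, and filling in the documented defaults locally (monthday 1, weekday monday) is an assumption made for clarity, since the API applies those defaults itself when the keys are omitted.

```python
WEEKDAYS = ("monday", "tuesday", "wednesday", "thursday", "friday", "saturday", "sunday")
EVERY_VALUES = ("deploy", "month", "week", "day")


def build_periodicity(every, monthday=None, weekday=None):
    """Return a periodicity map, rejecting values outside the documented ranges."""
    if every not in EVERY_VALUES:
        raise ValueError(f"every must be one of {EVERY_VALUES}, got {every!r}")
    periodicity = {"every": every}
    if every == "month":
        day = 1 if monthday is None else monthday  # documented default
        if not isinstance(day, int) or not 1 <= day <= 28:
            raise ValueError("monthday must be an integer from 1 to 28")
        periodicity["monthday"] = day
    elif every == "week":
        day = "monday" if weekday is None else weekday  # documented default
        if day not in WEEKDAYS:
            raise ValueError(f"weekday must be one of {WEEKDAYS}")
        periodicity["weekday"] = day
    return periodicity


print(build_periodicity("month", monthday=15))
print(build_periodicity("week"))
```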
Optional attributes
- max_pages. The Spider will recursively follow internal links found until this limit is reached. Defaults to 100.
- rate_limit. Limit on the number of requests per second. Defaults to 1.
- perform_html_checks. Boolean to enable checks using the W3C Validator software on the Web Pages found. Defaults to true.
- perform_a11y_checks. Boolean to enable checks using Deque Axe Core software on the Web Pages found. Defaults to false.
- deep_crawl. Boolean to enable deep crawling. Defaults to true.
- dynamic_crawler. Boolean to use the Dynamic Crawler (for JS apps) instead of the default static crawler. Defaults to false.
- active. Boolean to enable the schedule. Defaults to true.
- initial_urls. Newline-separated list of URLs.
- exclusions. Newline-separated list of paths.
- device_id. ID of the device to be used for viewport emulation. Check the device list to see the available devices.
- device_rotated. Boolean to indicate the emulated device should be rotated. Defaults to false.
- tags. Comma-separated list of tags.
The next example shows how to form the body payload with the Schedule attributes.
Example: POST /api/v1/schedules
{
  "data": {
    "attributes": {
      "starting_url": "https://dummy.rocketvalidator.com",
      "max_pages": 100,
      "rate_limit": 3,
      "perform_html_checks": true,
      "perform_a11y_checks": true,
      "deep_crawl": true,
      "dynamic_crawler": false,
      "active": true,
      "periodicity": {
        "every": "month",
        "monthday": 15
      },
      "tags": "dev,dummy",
      "device_id": "c4f0f4be-e6dd-498a-b049-205be3604505",
      "device_rotated": true
    }
  }
}
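For illustration, the same request could be assembled with Python's standard library. This is a sketch under stated assumptions, not the official client: the base URL and token are placeholders, build_create_request is a hypothetical helper, and both the Bearer Authorization header and the application/json content type are assumptions, since this page does not describe authentication or required headers.

```python
import json
import urllib.request


def build_create_request(base_url, token, attributes):
    """Build (but do not send) a POST request that creates a Schedule."""
    body = json.dumps({"data": {"attributes": attributes}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/schedules",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",   # assumed content type
            "Authorization": f"Bearer {token}",   # assumed auth scheme
        },
    )


request = build_create_request(
    "https://example.com",  # placeholder base URL
    "YOUR_API_TOKEN",       # placeholder token
    {
        "starting_url": "https://dummy.rocketvalidator.com",
        "periodicity": {"every": "month", "monthday": 15},
        "tags": "dev,dummy",
    },
)
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```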
Update a Schedule
To update an existing Schedule, send a PATCH request to /api/v1/schedules/$SCHEDULE_ID, with a JSON payload in the body including the attributes you want to change.
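A partial update might look like the following sketch. It assumes the PATCH body uses the same data.attributes envelope as the create payload, which this page does not state explicitly; the base URL, token, Bearer header and the helper name build_update_request are likewise placeholders or assumptions.

```python
import json
import urllib.request


def build_update_request(base_url, token, schedule_id, changes):
    """Build (but do not send) a PATCH request carrying only the changed attributes."""
    # Assumption: same {"data": {"attributes": ...}} envelope as when creating.
    body = json.dumps({"data": {"attributes": changes}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/schedules/{schedule_id}",
        data=body,
        method="PATCH",
        headers={
            "Content-Type": "application/json",   # assumed content type
            "Authorization": f"Bearer {token}",   # assumed auth scheme
        },
    )


# Example: pause a schedule and lower its page limit.
patch = build_update_request(
    "https://example.com",
    "YOUR_API_TOKEN",
    "2d8cc37a-1467-493b-8660-f97e33ca2c0a",
    {"active": False, "max_pages": 50},
)
print(patch.get_method(), patch.full_url)
```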
Retrieve a Schedule
To retrieve an individual Schedule in your account, send a GET request to /api/v1/schedules/$SCHEDULE_ID.
Delete a Schedule
To delete an individual Schedule from your account, send a DELETE request to /api/v1/schedules/$SCHEDULE_ID.
Example: DELETE /api/v1/schedules/$SCHEDULE_ID
204 No Content
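Retrieving and deleting share the same URL and differ only in the HTTP method, which the sketch below makes explicit. The helper names, base URL, token and Bearer header are assumptions for illustration; the only status code taken from this page is the 204 No Content returned on deletion.

```python
import urllib.request


def build_schedule_request(method, base_url, token, schedule_id):
    """Build (but do not send) a GET or DELETE request for one Schedule."""
    if method not in ("GET", "DELETE"):
        raise ValueError("method must be GET or DELETE")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/schedules/{schedule_id}",
        method=method,
        headers={"Authorization": f"Bearer {token}"},  # assumed auth scheme
    )


def was_deleted(status_code):
    """A successful DELETE answers with 204 No Content."""
    return status_code == 204


schedule_id = "2d8cc37a-1467-493b-8660-f97e33ca2c0a"
fetch = build_schedule_request("GET", "https://example.com", "YOUR_API_TOKEN", schedule_id)
delete = build_schedule_request("DELETE", "https://example.com", "YOUR_API_TOKEN", schedule_id)
print(fetch.get_method(), delete.get_method(), delete.full_url)
```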
List your Schedules
To list all Schedules in your account, send a GET request to /api/v1/schedules.
Filtering by URL
To include only the Schedules for a given starting_url, use the filter[url] option.
Example: return all schedules with url containing "dummy.rocketvalidator.com"
GET /api/v1/schedules?filter[url]=dummy.rocketvalidator.com
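When the filter value comes from user input, it may be safer to build the query string programmatically. A minimal sketch, where schedules_by_url is a hypothetical helper and the square brackets are deliberately left unescaped to match the example above:

```python
from urllib.parse import urlencode


def schedules_by_url(fragment):
    """Return the request path that lists Schedules whose URL contains the fragment."""
    # safe="[]" keeps the brackets of filter[url] readable; other characters are escaped.
    return "/api/v1/schedules?" + urlencode({"filter[url]": fragment}, safe="[]")


print(schedules_by_url("dummy.rocketvalidator.com"))
```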
Filtering by tag
To include only the Schedules for a given tags combination, use the filter[tags] options:
- filter[tags][mode] setting the tag combination mode, which can be any, all or none.
- filter[tags][list] including a comma-separated list of tags.
Example: return all schedules tagged with any of "dev" or "dummy"
GET /api/v1/schedules?filter[tags][mode]=any&filter[tags][list]=dev,dummy
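The two tag options can be combined in a small helper that also rejects unknown modes before a request is made. This is an illustrative sketch: schedules_by_tags is a hypothetical name, and defaulting the mode to any is a choice made here, not documented API behaviour.

```python
from urllib.parse import urlencode

TAG_MODES = ("any", "all", "none")


def schedules_by_tags(tags, mode="any"):
    """Return the request path that lists Schedules matching a tag combination."""
    if mode not in TAG_MODES:
        raise ValueError(f"mode must be one of {TAG_MODES}, got {mode!r}")
    query = {
        "filter[tags][mode]": mode,
        "filter[tags][list]": ",".join(tags),
    }
    # Keep brackets and commas unescaped so the path matches the documented form.
    return "/api/v1/schedules?" + urlencode(query, safe="[],")


print(schedules_by_tags(["dev", "dummy"]))
```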
Filter Reports by Schedule
To list all the Reports in your account that have been created via a given Schedule, refer to the Reports documentation.