I. Basic concepts
APScheduler stands for Advanced Python Scheduler.
It runs a specified job according to a specified time rule.
- The time rule can be a fixed interval between runs, a specific date and time, or a crontab-style schedule like the one used on Linux.
- The job is a Python function.
1.1. Triggers
Triggers set the conditions under which a task runs.
A trigger contains the scheduling logic. Each job has its own trigger, which determines when the job should run next. Apart from their initial configuration, triggers are completely stateless.
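Because a trigger keeps no state beyond its configuration, it can be pictured as a pure function from the current time to the next run time. The sketch below illustrates that idea for a fixed interval using only the standard library. `next_fire_time` is a hypothetical helper written for this article, not part of APScheduler, and the real interval trigger handles more cases (time zones, end dates and so on).

```python
from datetime import datetime, timedelta


def next_fire_time(start, interval, now):
    """Return the first time start + k * interval (k >= 0) that is not before now.

    Hypothetical helper: the result depends only on the arguments,
    which is what "stateless" means for a trigger.
    """
    if now <= start:
        return start
    # Ceiling division on timedeltas: how many whole intervals are needed to reach now
    steps = -((start - now) // interval)
    return start + steps * interval


start = datetime(2022, 4, 9, 17, 0, 0)
every_20_min = timedelta(minutes=20)
print(next_fire_time(start, every_20_min, datetime(2022, 4, 9, 17, 25)))  # 2022-04-09 17:40:00
```

Calling the function twice with the same arguments always gives the same answer, so nothing has to be remembered between calls.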
1.2. Job stores
Job stores hold the scheduled tasks, either in memory or in a database.
- By default, tasks are kept in memory. A job store can also be configured to use one of several kinds of database. When tasks are stored in a database, they are serialized on save and deserialized on load. The job store also implements modifying and looking up tasks.
- Never share a job store between multiple schedulers, as this leads to inconsistent state.
1.3. Executors
Executors run the tasks. You can choose how tasks are executed, for example in a thread pool or a process pool.
The executor submits each task to its thread pool or process pool and notifies the scheduler when the task has finished.
1.4. Schedulers
A scheduler ties the three components above together: they are passed as configuration when the scheduler instance is created.
Generally an application needs only one scheduler. Developers normally do not work with the job stores, executors and triggers directly, because the scheduler provides a unified interface to them, for example for adding, removing and querying tasks.
II. Scheduler details
- BlockingScheduler: blocking scheduler, for programs in which the scheduler is the only thing running.
- BackgroundScheduler: background scheduler, for non-blocking use; the scheduler runs independently in the background.
- AsyncIOScheduler: for applications that use asyncio.
- GeventScheduler: for applications that use gevent.
- TornadoScheduler: for Tornado applications.
- TwistedScheduler: for Twisted applications.
- QtScheduler: for Qt applications.
2.1. APScheduler has three built-in triggers
- date: runs the task once at a specific date and time.
- interval: runs the task at fixed time intervals.
- cron: runs the task periodically at specific times, in crontab style.
2.2. Common job parameters
- id: a unique ID for the job.
- name: a name for the job.
- coalesce: applies when several runs of a job have piled up without actually executing. For example, if the system hangs for 5 minutes and then recovers, a job that runs once a minute was due to run 5 times in that window but did not. With coalesce=True, the job is submitted to the executor only once, for the most recent due run. With coalesce=False, it is submitted 5 times, although not every run is guaranteed to execute, because other conditions also apply (see misfire_grace_time below).
- max_instances: the maximum number of instances of the same job that may run at the same time. For example, if a job takes 10 minutes to run and is scheduled once every minute with max_instances=5, then in minutes 6 to 10 no new instance is started, because 5 instances are already running.
- misfire_grace_time: the number of seconds after its scheduled time that a run may still be executed. Consider a scenario similar to the one for coalesce: a job was due at 14:00 but for some reason was not scheduled, and it is now 14:01. When the 14:00 run is submitted, the difference between its scheduled time and the current time (here 1 minute) is checked. If misfire_grace_time is set to 30 seconds, the difference exceeds the limit and this run is not executed.
- replace_existing: if jobs are kept in a persistent job store and are scheduled while the application initializes, you must define an explicit id for each job and pass replace_existing=True. Otherwise the application gets a new copy of the job every time it restarts.
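The interaction between coalesce and misfire_grace_time can be easier to follow as code. The function below is a simplified model of the rules described above, written with the standard library only; `runs_to_execute` is a hypothetical name and this is not APScheduler's actual implementation.

```python
from datetime import datetime, timedelta


def runs_to_execute(due_times, now, coalesce, misfire_grace_time):
    """Simplified model: decide which overdue runs still get executed.

    due_times: scheduled run times that have passed without running, oldest first
    misfire_grace_time: allowed lateness in seconds, or None for no limit
    """
    if coalesce and due_times:
        due_times = due_times[-1:]  # keep only the most recent due run
    if misfire_grace_time is None:
        return list(due_times)
    grace = timedelta(seconds=misfire_grace_time)
    # Drop every run that is later than the grace period allows
    return [t for t in due_times if now - t <= grace]


# A job due every minute from 14:00 to 14:04; the system recovers at 14:04:10
due = [datetime(2022, 4, 9, 14, m) for m in range(5)]
now = datetime(2022, 4, 9, 14, 4, 10)

print(len(runs_to_execute(due, now, coalesce=True, misfire_grace_time=30)))     # 1
print(len(runs_to_execute(due, now, coalesce=False, misfire_grace_time=120)))   # 2
print(len(runs_to_execute(due, now, coalesce=False, misfire_grace_time=None)))  # 5
```

In the second call only the 14:03 and 14:04 runs are within 120 seconds of the current time, which is why coalesce=False does not necessarily mean all 5 runs execute.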
2.3. date trigger
date is the most basic type of scheduling: the job runs only once, at a specific point in time. Its parameters are as follows.
Parameter | Description |
---|---|
run_date (datetime or str) | Date and time at which to run the job |
timezone (datetime.tzinfo or str) | Time zone for run_date |
```python
from datetime import date, datetime

from apscheduler.schedulers.blocking import BlockingScheduler


def job(text):
    print(text)


scheduler = BlockingScheduler()

# Note: use dates in the future; run times already in the past are reported as missed.
# Run job once on 2022-04-09 (a bare date means midnight, 00:00:00)
scheduler.add_job(job, 'date', run_date=date(2022, 4, 9), args=['text1'], coalesce=True, max_instances=1)
# Run job once at 2022-04-09 17:40:58
scheduler.add_job(job, 'date', run_date=datetime(2022, 4, 9, 17, 40, 58), args=['text2'], coalesce=True, max_instances=1)
# Run job once at 2022-04-09 17:41:00, with the time given as a string
scheduler.add_job(job, 'date', run_date='2022-4-9 17:41:00', args=['text3'], coalesce=True, max_instances=1)

scheduler.start()
```
2.4. interval trigger
Parameter | Description |
---|---|
weeks (int) | Number of weeks between runs |
days (int) | Number of days between runs |
hours (int) | Number of hours between runs |
minutes (int) | Number of minutes between runs |
seconds (int) | Number of seconds between runs |
start_date (datetime or str) | Start date |
end_date (datetime or str) | End date |
timezone (datetime.tzinfo or str) | Time zone |
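```python
import os

from apscheduler.schedulers.blocking import BlockingScheduler

# These definitions are not shown in the original listing; the values are placeholders.
ROOT = "."  # assumed: directory that contains run_spider.sh
spider_job_name = "rzjg"  # assumed: prefix used to build the job id
sched = BlockingScheduler()  # assumed scheduler type


@sched.scheduled_job(
    "interval",
    id=spider_job_name + "_bg_data",
    coalesce=True,
    max_instances=1,
    minutes=20,
)
def tick_rzjg_detail_xq():
    """Run the rzjg_bg_data spider through run_spider.sh."""
    each = "rzjg_bg_data"
    cmd_str = f"cd {ROOT} && bash run_spider.sh {each} --loglevel=INFO"
    print(cmd_str)
    # The call that runs the command is missing from the original; os.system is assumed.
    os.system(cmd_str)


def func():
    print("Press Ctrl+C to exit")
    # Trigger once immediately instead of waiting for the first interval
    tick_rzjg_detail_xq()
    try:
        sched.start()
    except (KeyboardInterrupt, SystemExit):
        pass


if __name__ == "__main__":
    func()
```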
2.5. cron trigger
The cron trigger fires periodically at specific times, using a syntax similar to the Linux crontab format. It is the most powerful of the built-in triggers.
Parameter | Description |
---|---|
year (int or str) | 4-digit year |
month (int or str) | Month (1-12) |
day (int or str) | Day of the month (1-31) |
week (int or str) | Week of the year (1-53) |
day_of_week (int or str) | Number or name of the weekday (0-6 or mon,tue,wed,thu,fri,sat,sun) |
hour (int or str) | Hour (0-23) |
minute (int or str) | Minute (0-59) |
second (int or str) | Second (0-59) |
start_date (datetime or str) | Earliest date and time to trigger on (inclusive) |
end_date (datetime or str) | Latest date and time to trigger on (inclusive) |
timezone (datetime.tzinfo or str) | Time zone |
The fields accept the following expressions.
Expression | Field | Description |
---|---|---|
* | any | Wildcard; fires on every value. Example: minute='*' fires every minute. |
*/a | any | Fires every a values, starting from the minimum |
a-b | any | Fires on any value within the range a-b |
a-b/c | any | Fires every c values within the range a-b |
xth y | day | Fires on the x-th occurrence of weekday y within the month |
last x | day | Fires on the last occurrence of weekday x within the month |
last | day | Fires on the last day of the month |
x,y,z | any | Fires on any of the listed expressions; any of the expressions above can be combined this way |
```python
import time

from apscheduler.schedulers.blocking import BlockingScheduler


def job(text):
    t = time.strftime('%Y-%m-%d %H:%M:%S', time.localtime(time.time()))
    print('{} --- {}'.format(text, t))


scheduler = BlockingScheduler()

# Run job every minute during the 22:00 hour, every day
# scheduler.add_job(job, 'cron', hour=22, minute='*/1', args=['job1'])
# Run job at 22:25 and 23:25 every day
scheduler.add_job(job, 'cron', hour='22-23', minute='25', args=['job2'])
# Run job at 8:00 every day
scheduler.add_job(job, 'cron', hour='8', args=['job3'])
# Run job at 8:30 and 20:30 every day, allowing up to 4 concurrent instances
scheduler.add_job(job, 'cron', hour='8,20', minute=30, args=['job4'], max_instances=4)

scheduler.start()
```
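To see which values a cron field expression selects, it can help to expand it by hand. The sketch below does this for the numeric forms (`*`, `*/a`, `a-b`, `a-b/c` and comma-separated lists) using only the standard library. `expand_field` is a hypothetical helper for illustration, not APScheduler's parser, and it does not cover weekday names, `xth y` or `last`.

```python
def expand_field(expr, minimum, maximum):
    """Expand a numeric cron-style field expression into the sorted values it matches.

    minimum and maximum are the field's bounds, for example 0 and 59 for minute.
    """
    values = set()
    for part in str(expr).split(','):
        base, _, step = part.strip().partition('/')
        step = int(step) if step else 1
        if base == '*':
            first, last = minimum, maximum
        elif '-' in base:
            low, high = base.split('-')
            first, last = int(low), int(high)
        else:
            first = last = int(base)
        # Every step-th value, counted from the start of the range
        values.update(range(first, last + 1, step))
    return sorted(values)


print(expand_field('*/15', 0, 59))      # [0, 15, 30, 45]
print(expand_field('22-23', 0, 23))     # [22, 23]
print(expand_field('10-30/10', 0, 59))  # [10, 20, 30]
print(expand_field('8,20', 0, 23))      # [8, 20]
```

Read this way, hour='22-23' combined with minute='25' selects 22:25 and 23:25, which matches the schedule of job2 in the example above.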