Overview
A Discord Notifier to send progress updates, params and results to a Discord channel.
Free software: MIT license
Installation
pip install transformer-discord-notifier
You can also install the in-development version with:
pip install https://github.com/Querela/python-transformer-discord-notifier/archive/master.zip
Documentation
https://python-transformer-discord-notifier.readthedocs.io/
To build the documentation locally:
git clone https://github.com/Querela/python-transformer-discord-notifier.git
cd python-transformer-discord-notifier
sphinx-build -b html docs dist/docs
Development
To run all the tests run:
tox
Note, to combine the coverage data from all the tox environments run:
Windows:
set PYTEST_ADDOPTS=--cov-append
tox
Other:
PYTEST_ADDOPTS=--cov-append tox
Usage
Using DiscordProgressCallback
How to use the DiscordProgressCallback with a huggingface.co Transformers Trainer in a project/training script:
from transformers import Trainer
# ... other imports ...
from transformer_discord_notifier import DiscordProgressCallback


def run_trainer():
    # ... set up things beforehand ...

    # Initialize the Discord bot
    dpc = DiscordProgressCallback(token=None, channel=None, create_experiment_channels=False)
    dpc.start()

    # Initialize our Trainer
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        # ...
        # add our callback to the trainer
        callbacks=[dpc],
    )

    # ... do things like train/eval/predict

    # shut down our Discord handler, as it would otherwise continue to run indefinitely
    dpc.end()
Alternatively, since version v0.2.0 you can omit starting and stopping the DiscordProgressCallback and use it like any other huggingface.co callback handler:
from transformers import Trainer
# ... other imports ...
from transformer_discord_notifier import DiscordProgressCallback


def run_trainer():
    # ... set up transformer stuff beforehand ...

    # Initialize our Trainer
    trainer = Trainer(
        model=model,
        args=training_args,
        train_dataset=train_dataset,
        eval_dataset=eval_dataset,
        # ...
        # add our callback to the trainer
        callbacks=[DiscordProgressCallback],
    )

    # ... do things like train/eval/predict

    # ... when the trainer instance is garbage collected, it will clean up the Discord bot
Note, however, that both token and channel should be provided, either as class initialization parameters or as the environment variables DISCORD_TOKEN and DISCORD_CHANNEL. The handler will try to load them from the environment variables if the instance properties are None. Both must be explicitly provided for the callback to work correctly!
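The fallback order described above can be pictured with a small sketch. The helper `resolve_setting` is hypothetical and not part of the library; it only illustrates "explicit argument first, environment variable second", and the values assigned to `DISCORD_TOKEN` and `DISCORD_CHANNEL` are placeholders.

```python
import os
from typing import Optional


def resolve_setting(value: Optional[str], env_name: str) -> Optional[str]:
    """Return the explicit value, else the environment variable (hypothetical helper)."""
    if value is not None:
        return value
    return os.environ.get(env_name)


# Placeholder values, for illustration only.
os.environ["DISCORD_TOKEN"] = "token-from-env"
os.environ["DISCORD_CHANNEL"] = "training-logs"

# No explicit token given: the environment variable is used.
token = resolve_setting(None, "DISCORD_TOKEN")
# An explicit channel wins over the environment variable.
channel = resolve_setting("my-channel", "DISCORD_CHANNEL")
```

If neither source provides a value, the sketch returns None, which corresponds to the case the note above warns about.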
Since version v0.5.0, we included the ability to create separate experiment channels. To enable those, set the following environment variables:
# yes|y|true|t|1 will enable the channel creation
export DISCORD_CREATE_EXPERIMENT_CHANNEL=yes
# (optional) set (or create) a category channel for the experiment channels
export DISCORD_EXPERIMENT_CATEGORY="All my Experiments"
# (optional) override and set the run_name / experiment name
# new text channel name, note, that it will be all lowercase, "-" for whitespaces
export DISCORD_EXPERIMENT_NAME="Experiment-Run-A1B2"
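As a rough illustration of how such values might be interpreted, here is a minimal sketch. Both helpers are hypothetical and not taken from the library; case-insensitive matching of the flag and collapsing runs of whitespace into a single "-" are assumptions based on the comments above.

```python
import re
from typing import Optional

# Values the comments above list as enabling channel creation.
TRUTHY = {"yes", "y", "true", "t", "1"}


def is_enabled(raw: Optional[str]) -> bool:
    """Interpret a flag such as DISCORD_CREATE_EXPERIMENT_CHANNEL (hypothetical helper)."""
    # Assumption: matching ignores case and surrounding whitespace.
    return raw is not None and raw.strip().lower() in TRUTHY


def to_channel_name(experiment_name: str) -> str:
    """Lowercase the name and replace whitespace with '-' (hypothetical helper)."""
    return re.sub(r"\s+", "-", experiment_name.strip().lower())


enabled = is_enabled("yes")
channel_name = to_channel_name("Experiment-Run-A1B2")
```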
How to set up a Discord bot
How to set up a Discord bot, and how to get the token or the channel id? Please visit the following link:
Related project discord-notifier-bot, setup guide in README
Reference
transformer_discord_notifier
Imports DiscordClient and DiscordProgressCallback.
transformer_discord_notifier.discord
class transformer_discord_notifier.discord.DiscordClient(token: Optional[str] = None, channel: Optional[Union[str, int]] = None, create_experiment_channels: Optional[bool] = None, experiment_category: Optional[str] = None, experiment_name: Optional[str] = None)
A blocking wrapper around the asyncio Discord.py client.

_experiment_category_channel: Optional[discord.channel.CategoryChannel]
Stores the category channel instance.

_load_credentials() → None
Try to load missing Discord configs (token, channel) from environment variables.

_find_default_channel(name: Optional[str] = None, default_name: str = 'default') → int
Try to find a writable text channel.
The search proceeds in this order:
If name is provided, search for this channel first.
If not found, search for self._discord_channel, the channel that can be configured on instance creation or by loading environment variables. Check first for a channel with the given name as string, then fall back to an integer channel id.
If still not found, search for a channel with a given default name, like “default” or “Allgemein”. As this seems to depend on the language, it might not find one.
If after all this still no channel has been found, either because no channel with the given names/id exists, or because the Discord token gives no access to any guilds/channels, we throw a RuntimeError. We now can’t use this callback handler.
Parameters
name (Optional[str], optional) – channel name to search for first, by default None
default_name (str, optional) – alternative default Discord channel name, by default “default”
Returns
int – channel id
Raises
RuntimeError – raised if no guild (Discord server) was found (i.e. the Discord bot has no permissions / was not yet invited to a Discord server)
RuntimeError – raised if channel could not be found
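The lookup order described for _find_default_channel can be sketched without any Discord objects. This is a simplified model, not the library's implementation: channels are plain `(id, name)` tuples, and the function name `find_channel_id` and its `configured` parameter (standing in for self._discord_channel) are hypothetical.

```python
from typing import List, Optional, Tuple, Union

# (channel id, channel name): a stand-in for real Discord channel objects.
Channel = Tuple[int, str]


def find_channel_id(
    channels: List[Channel],
    configured: Optional[Union[str, int]] = None,
    name: Optional[str] = None,
    default_name: str = "default",
) -> int:
    if not channels:
        raise RuntimeError("no guild/channels available")

    by_name = {ch_name: ch_id for ch_id, ch_name in channels}
    ids = {ch_id for ch_id, _ in channels}

    # 1. An explicitly requested name is tried first.
    if name is not None and name in by_name:
        return by_name[name]

    # 2. The configured channel: first as a name, then as an integer id.
    if configured is not None:
        if str(configured) in by_name:
            return by_name[str(configured)]
        try:
            as_id = int(configured)
        except (TypeError, ValueError):
            as_id = None
        if as_id in ids:
            return as_id

    # 3. Finally fall back to the default channel name.
    if default_name in by_name:
        return by_name[default_name]

    raise RuntimeError("channel could not be found")


channels = [(1, "general"), (2, "logs"), (3, "default")]
by_explicit_name = find_channel_id(channels, name="logs")
by_numeric_id = find_channel_id(channels, configured="1")
by_default = find_channel_id(channels, configured="missing")
```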

init()
Initialize Discord bot for accessing Discord/writing messages.
It loads the credentials, starts the asyncio Discord bot in a separate thread and after connecting searches for our target channel.
Raises
RuntimeError – raised on error while initializing the Discord bot, like invalid token or channel not found, etc.

set_experiment_channel_name(name: str, overwrite: bool = False) → None
Set experiment channel name. Create channel if it does not exist.
If overwrite is False then a previously set experiment name, e.g. via environment variables, will not be overwritten.
Parameters
name (str) – Name of the experiment (channel name)
overwrite (bool, optional) – whether to overwrite an existing channel name, by default False
Raises
RuntimeError – raised if experiment channel creation failed

_quit_client()
Internal. Try to properly quit the Discord client if necessary, and close the asyncio loop if required.

quit()
Shutdown the Discord bot.
# exceptions: concurrent.futures._base.TimeoutError
Tries to close the Discord bot safely, closes the asyncio loop, waits for the background thread to stop (daemonized, so on program exit it will quit anyway).

send_message(text: str = '', embed: Optional[discord.embeds.Embed] = None) → Optional[int]
Sends a message to our Discord channel. Returns the message id.
Parameters
text (str, optional) – text message to send, by default “”
embed (Optional[discord.Embed], optional) – embed object to attach to message, by default None
Returns
Optional[int] – message id if text and embed were both not None, None if nothing was sent

get_message_by_id(msg_id: Optional[int]) → Optional[discord.message.Message]
Try to retrieve a Discord message by its id.
Parameters
msg_id (Optional[int]) – message id of message sent in Discord channel, if it is None then it will be ignored, and None returned
Returns
Optional[discord.Message] – None if message could not be found by msg_id, else return the message object

update_or_send_message(msg_id: Optional[int] = None, **fields) → Optional[int]
Wrapper for send_message() to update an existing message, identified by msg_id, or simply send a new message if no prior message is found.
Parameters
msg_id (Optional[int], optional) – message id of prior message sent in channel, if not provided then send a new message.
text (str, optional) – text message, if set to None it will remove prior message content
embed (Optional[discord.Embed], optional) – Discord embed, set to None to delete existing embed
Returns
Optional[int] – message id of updated or newly sent message, None if nothing was sent
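The "update if found, otherwise send" behaviour can be modelled with an in-memory dictionary in place of a Discord channel. The functions below are hypothetical stand-ins that only share their names with the documented methods; the real client talks to Discord and also handles embeds, which this sketch leaves out.

```python
from itertools import count
from typing import Dict, Optional

_next_id = count(1)              # fake message id generator
messages: Dict[int, str] = {}    # fake channel: message id -> text


def send_message(text: str = "") -> Optional[int]:
    """Store a new message and return its id, or None if there is nothing to send."""
    if not text:
        return None
    msg_id = next(_next_id)
    messages[msg_id] = text
    return msg_id


def update_or_send_message(msg_id: Optional[int] = None, text: str = "") -> Optional[int]:
    """Edit the message with msg_id if it exists, otherwise send a new one."""
    if msg_id is not None and msg_id in messages:
        messages[msg_id] = text
        return msg_id
    return send_message(text)


first = update_or_send_message(text="step 1/10")               # no prior message: sends
second = update_or_send_message(msg_id=first, text="step 2/10")  # prior message: edits it
third = update_or_send_message(msg_id=999, text="other")         # unknown id: sends anew
```

Reusing the returned id for the next call is what keeps progress updates in one message instead of many.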

delete_later(msg_id: Optional[int], delay: Union[int, float] = 5) → bool
Runs a delayed message deletion function.
Parameters
msg_id (Optional[int]) – message id of message sent in Discord channel, if it is None it will be silently ignored
delay (Union[int, float], optional) – delay in seconds before the message is deleted, by default 5
Returns
bool – True if message deletion is queued, False if message could not be found in channel

static build_embed(kvs: Dict[str, Any], title: Optional[str] = None, footer: Optional[str] = None) → discord.embeds.Embed
Builds a rich Embed from key-values.
Parameters
kvs (Dict[str, Any]) – Key-Value dictionary for embed fields, non int/float values will be formatted with pprint.pformat()
title (Optional[str], optional) – title string, by default None
footer (Optional[str], optional) – footer string, by default None
Returns
discord.Embed – embed object to send via send_message()
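The value formatting described for build_embed can be sketched without discord.py. The helper `format_fields` is hypothetical; it returns plain `(name, value)` string pairs instead of an Embed, and rendering numbers with `str()` is an assumption, since the reference only states that non int/float values go through pprint.pformat().

```python
from pprint import pformat
from typing import Any, Dict, List, Tuple


def format_fields(kvs: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Turn a key-value dictionary into (name, value) string pairs (hypothetical helper)."""
    fields = []
    for key, value in kvs.items():
        if isinstance(value, (int, float)):
            text = str(value)      # assumption: numbers are rendered as-is
        else:
            text = pformat(value)  # everything else is pretty-printed
        fields.append((str(key), text))
    return fields


fields = format_fields({"epoch": 3, "loss": 0.25, "labels": ["a", "b"]})
```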
The DiscordClient can be used standalone, but it might be easier to just extract the module code to avoid having to install all the related transformers requirements. It wraps the asyncio Discord.py client inside a background thread and makes its calls essentially blocking. This eases its usage in foreign code that does not use asyncio.
from transformer_discord_notifier.discord import DiscordClient

# configuration
token = "abc123.xyz..."
channel = "Allgemein"

# create client and start background thread, to connect/login ...
# if token/channel are None, it will try to load from environment variables
client = DiscordClient(token=token, channel=channel, create_experiment_channels=False)
client.init()

# send message
msg_id = client.send_message("test")

# update message content
msg_id = client.update_or_send_message(text="abc", msg_id=msg_id)

# delete it after 3.1 seconds,
# NOTE: this call will not block!
client.delete_later(msg_id, delay=3.1)

# quit client (cancel outstanding tasks!, quit asyncio thread)
client.quit()
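The general technique of running an asyncio loop in a background thread and calling into it synchronously can be sketched with the standard library alone. This is not the DiscordClient code; `BlockingRunner` and `fake_send` are hypothetical names, and the coroutine merely simulates a network call.

```python
import asyncio
import threading


class BlockingRunner:
    """Runs coroutines on an event loop that lives in a background thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        # Daemon thread: it will not keep the program alive on exit.
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()

    def call(self, coro, timeout: float = 5.0):
        """Schedule the coroutine on the loop and block until its result is available."""
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout=timeout)

    def quit(self) -> None:
        """Stop the loop, wait for the thread to finish, then close the loop."""
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join(timeout=5.0)
        self._loop.close()


async def fake_send(text: str) -> int:
    """Stand-in for an async network call; returns a fake message id."""
    await asyncio.sleep(0.01)
    return len(text)


runner = BlockingRunner()
result = runner.call(fake_send("test"))  # blocks the calling thread until done
runner.quit()
```

Synchronous code never has to touch `await` in this arrangement, which is the convenience the paragraph above describes.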
transformer_discord_notifier.transformers
class transformer_discord_notifier.transformers.DiscordProgressCallback(token: Optional[str] = None, channel: Optional[Union[str, int]] = None)
Bases: transformers.trainer_callback.ProgressCallback
An extended transformers.trainer_callback.ProgressCallback that logs training and evaluation progress and statistics to a Discord channel.
Variables
client (DiscordClient) – a blocking Discord client
disabled (bool) – True if the Discord client couldn’t be initialized successfully, all callback methods are disabled silently
Parameters
token (Optional[str], optional) – Discord bot token, by default None
channel (Optional[Union[str, int]], optional) – Discord channel name or numeric id, by default None

on_init_end(args: transformers.training_args.TrainingArguments, state: transformers.trainer_callback.TrainerState, control: transformers.trainer_callback.TrainerControl, **kwargs)
Event called at the end of the initialization of the Trainer.

_new_tqdm_bar(desc: str, msg_fmt: str, delete_after: bool = True, **kwargs) → Tuple[tqdm.std.tqdm, transformer_discord_notifier.transformers.MessageWrapperTQDMWriter]
Builds an internal tqdm wrapper for progress tracking.
Patches its file.write method to forward it to Discord. Tries to update existing messages to avoid spamming the channel.
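The idea of redirecting progress output through a file-like object can be illustrated without tqdm or Discord. `MessageWriter` and `fake_forward` are hypothetical and do not reproduce MessageWrapperTQDMWriter; the sketch only shows how a write() method can forward text and reuse one message id so that later updates replace earlier ones.

```python
from typing import Callable, Dict, Optional


class MessageWriter:
    """File-like object that forwards each non-empty write to a callback."""

    def __init__(self, forward: Callable[[str, Optional[int]], int]) -> None:
        self._forward = forward
        self._msg_id: Optional[int] = None

    def write(self, text: str) -> None:
        text = text.strip()
        if not text:
            return  # ignore bare newlines and whitespace
        # Reuse the previous message id so the message is updated in place.
        self._msg_id = self._forward(text, self._msg_id)

    def flush(self) -> None:
        pass  # nothing is buffered


sent: Dict[int, str] = {}


def fake_forward(text: str, msg_id: Optional[int]) -> int:
    """Stand-in for an 'update or send' call; stores the latest text per id."""
    msg_id = 1 if msg_id is None else msg_id
    sent[msg_id] = text
    return msg_id


writer = MessageWriter(fake_forward)
print("10%", file=writer)
print("20%", file=writer)
```

After both prints, only one entry exists in `sent`, holding the latest progress text.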

on_train_begin(args: transformers.training_args.TrainingArguments, state: transformers.trainer_callback.TrainerState, control: transformers.trainer_callback.TrainerControl, **kwargs)
Event called at the beginning of training.

on_prediction_step(args: transformers.training_args.TrainingArguments, state: transformers.trainer_callback.TrainerState, control: transformers.trainer_callback.TrainerControl, eval_dataloader=None, **kwargs)
Event called after a prediction step.

on_step_end(args: transformers.training_args.TrainingArguments, state: transformers.trainer_callback.TrainerState, control: transformers.trainer_callback.TrainerControl, **kwargs)
Event called at the end of a training step. If using gradient accumulation, one training step might take several inputs.

on_epoch_begin(args: transformers.training_args.TrainingArguments, state: transformers.trainer_callback.TrainerState, control: transformers.trainer_callback.TrainerControl, **kwargs)
Event called at the beginning of an epoch.

on_epoch_end(args: transformers.training_args.TrainingArguments, state: transformers.trainer_callback.TrainerState, control: transformers.trainer_callback.TrainerControl, **kwargs)
Event called at the end of an epoch.

on_train_end(args: transformers.training_args.TrainingArguments, state: transformers.trainer_callback.TrainerState, control: transformers.trainer_callback.TrainerControl, **kwargs)
Event called at the end of training.

on_evaluate(args: transformers.training_args.TrainingArguments, state: transformers.trainer_callback.TrainerState, control: transformers.trainer_callback.TrainerControl, **kwargs)
Event called after an evaluation phase.

_send_log_results(logs: Dict[str, Any], state: transformers.trainer_callback.TrainerState, args: transformers.training_args.TrainingArguments, is_train: bool) → Optional[int]
Formats current log metrics as an Embed message.
Given the huggingface transformers Trainer callback parameters, we create a discord.Embed with the metrics as key-values, send the message and return the message id.
Contributing
Contributions are welcome, and they are greatly appreciated! Every little bit helps, and credit will always be given.
Bug reports
When reporting a bug please include:
Your operating system name and version.
Any details about your local setup that might be helpful in troubleshooting.
Detailed steps to reproduce the bug.
Documentation improvements
Transformer Discord Notifier could always use more documentation, whether as part of the official Transformer Discord Notifier docs, in docstrings, or even on the web in blog posts, articles, and such.
Feature requests and feedback
The best way to send feedback is to file an issue at https://github.com/Querela/python-transformer-discord-notifier/issues.
If you are proposing a feature:
Explain in detail how it would work.
Keep the scope as narrow as possible, to make it easier to implement.
Remember that this is a volunteer-driven project, and that code contributions are welcome :)
Development
To set up python-transformer-discord-notifier for local development:
Fork python-transformer-discord-notifier (look for the “Fork” button).
Clone your fork locally:
git clone git@github.com:YOURGITHUBNAME/python-transformer-discord-notifier.git
Create a branch for local development:
git checkout -b name-of-your-bugfix-or-feature
Now you can make your changes locally.
When you’re done making changes run all the checks, tests and rebuild docs:
python setup.py check --strict --metadata --restructuredtext
check-manifest
flake8
isort --verbose --check-only --diff --filter-files src
sphinx-build -b doctest docs dist/docs
sphinx-build -b html docs dist/docs
sphinx-build -b linkcheck docs dist/docs
pytest
Or you can use tox to automatically run those commands:
tox
or just a single test:
tox -e check,docs
tox -e py38
Note that the tests with pytest require a valid Discord token and channel. They must be provided as --discord-token token, --discord-channel chan or DISCORD_TOKEN=token, DISCORD_CHANNEL=chan.
You can set the environment variables in the .env file to make them visible to both pytest and tox environments.
If you use VSCode, configure it to use an environment variable file: in .vscode/settings.json, add the setting "python.envFile": "${workspaceFolder}/.env".
Commit your changes and push your branch to GitHub:
git add .
git commit -m "Your detailed description of your changes."
git push origin name-of-your-bugfix-or-feature
Submit a pull request through the GitHub website.
Pull Request Guidelines
If you need some code review or feedback while you’re developing the code just make the pull request.
For merging, you should:
Run extensive tests.
Update documentation when there’s new API, functionality etc.
Add a note to CHANGELOG.rst about the changes.
Add yourself to AUTHORS.rst.
Authors
Erik Körner - koerner@informatik.uni-leipzig.de
Changelog
0.x.0 (WIP)
ignore linkcheck with version tag (if tags have not been pushed it will fail)
Blocking message deletion?
0.5.0 (2021-02-04)
Add dynamic experiment channel creation.
TODO: update docs and tests with better examples for experiment channels.
0.4.5 (2021-02-04)
Wrap common errors, like 5xx Discord Gateway errors, to allow uninterrupted training.
Add python3.10 to tests / github workflows.
0.4.4 (2020-12-22)
Github Actions - tox tests
0.4.3 (2020-12-18)
Github Actions - pypi publishing
0.4.2 (2020-12-18)
Add travis build jobs.
Add coveralls coverage statistics.
0.4.1 (2020-12-17)
Reintroduce tests with pytest and tox.
Add simple tests for DiscordClient.
Add tests for DiscordProgressCallback.
0.3.1 (2020-12-17)
Let Discord bot gracefully handle initialization failures.
Let transformer callback handler handle invalid configs gracefully, to simply exit.
Better handling of edge cases of Discord client login.
0.3.0 (2020-12-16)
Add (private) scripts (make venv, run checks).
Update usage docs.
Extend / rewrite discord client methods.
Reuse existing tqdm transformers.trainer_callback.ProgressCallback for progress tracking.
Fancy aggregation of prediction runs, split train progress into epochs.
0.2.1 (2020-12-15)
Correct setup.py validation.
Add (private) distribution/docs build scripts.
0.2.0 (2020-12-15)
0.1.0 (2020-12-11)
First release on PyPI.
First working version, tested manually.
Cleaned up skeleton files.
Updated docs.
0.0.0 (2020-12-10)
Initial code skeleton.