Datasets Changes
New: Add Russian SuperGLUE #2668 (@slowwavesleep)
New: Add Disfl-QA #2473 (@bhavitvyamalik)
New: Add TimeDial #2476 (@bhavitvyamalik)
Fix: Enumerate all ner_tags values in WNUT 17 dataset #2713 (@albertvillanova)
Fix: Update WikiANN data URL #2710 (@albertvillanova)
Fix: Update PAN-X data URL in XTREME dataset #2715 (@albertvillanova)
Fix: C4 - en subset by modifying dataset_info with correct validation infos #2723 (@thomasw21)

General improvements and bug fixes
fix: change string format to allow copy/paste to work in bash #2694 (@severo)
Update BibTeX entry #2706 (@albertvillanova)
Print absolute local paths in load_dataset error messages #2684 (@mariosasko)
Add support for disable_progress_bar on Windows...
Fix minimum tqdm version and import on Colab #2697 (@nateraw) Fix OSCAR Esperanto #2693 (@lhoestq
Bug fixes Fix error related to huggingface_hub timeout parameter #3082 (@albertvillanova) Remove...
Bug fixes Fix filter indices when batched by @albertvillanova in https://github.com/huggingface/dat...
Dataset Changes New: NLU evaluation data #2238 (@dkajtoch) New: Add SLR32, SLR52, SLR53 to OpenS...
Datasets Changes New: Microsoft CodeXGlue Datasets #2357 (@ncoop57) New: KLUE benchmark #2416 (@...
Datasets Changes New: C4 #2575 #2592 (@lhoestq) New: mC4 #2576 (@lhoestq) New: MasakhaNER #2465...
Bug fixes Prioritize module.builder_kwargs over defaults in TestCommand #3672 (@lvwerra) Fix TestCo...
New documentation New documentation structure #2718 (@stevhliu): New: Tutorials New: How-to...
Dataset changes Update: LexGLUE and MultiEURLEX README - update dataset cards #3075 (@iliaschalki...
Dataset changes New: CaSiNo #2867 (@kushalchawla) New: Mostly Basic Python Problems #2893 (@lvwe...
Datasets fixes Fix: irc_disentangle - fix checksum and bug dataset by @albertvillanova in https://g...
Bug fixes Fix MP3 resampling when a dataset's audio files have different sampling rates by @lhoestq...
Bug fixes Fix streaming datasets that are not reset correctly by @lhoestq in https://github.com/hug...
Dataset changes Update: Adapt all audio datasets #3081 (@patrickvonplaten) Bug fixes Update BibTe...
Datasets Features Support remote data files #2616 (@albertvillanova) This allows passing URLs of ...