Bug fixes
- Fix MP3 resampling when a dataset's audio files have different sampling rates by @lhoestq in https://github.com/huggingface/datasets/pull/3665
- Extend dataset builder for streaming in get_dataset_split_names by @mariosasko in https://github.com/huggingface/datasets/pull/3657

Dataset changes
- New: Turkic X-WMT evaluation set for machine translation by @mirzakhalov in https://github.com/huggingface/datasets/pull/3605
- New: British Library books dataset by @davanstrien in https://github.com/huggingface/datasets/pull/3603
- Fix: wiki_bio - Update link by @jxmorris12 in https://github.com/huggingface/datasets/pull/3651

Other improvements
- sp. Columbia => Colombia by @serapio in https://github.com/huggingface/datasets/pull/3652
- Run pyupg...
Datasets bug fixes Fix cnn_dailymail (dm stories were ignored) by @lhoestq in https://github.com/hu...
Bug fixes Fix patching module that doesn't exist by @lhoestq in https://github.com/huggingface/data...
Dataset Changes New: NLU evaluation data #2238 (@dkajtoch) New: Add SLR32, SLR52, SLR53 to OpenS...
Bug fixes Fix streaming datasets that are not reset correctly by @lhoestq in https://github.com/hug...
Improvements Make decoding of Audio and Image feature optional by @mariosasko in https://github.com...
Dataset changes Update: Adapt all audio datasets #3081 (@patrickvonplaten) Bug fixes Update BibTe...
Datasets fixes Fix: irc_disentangle - fix checksum and bug dataset by @albertvillanova in https://g...
Bug fixes Prioritize module.builder_kwargs over defaults in TestCommand #3672 (@lvwerra) Fix TestCo...
Dataset changes Update: LexGLUE and MultiEURLEX README - update dataset cards #3075 (@iliaschalki...
Datasets Changes New: Microsoft CodeXGlue Datasets #2357 (@ncoop57) New: KLUE benchmark #2416 (@...
Bug fixes Fix import datasets on python 3.10 by @lhoestq in https://github.com/huggingface/datasets...
Bug fixes Fix error related to huggingface_hub timeout parameter #3082 (@albertvillanova) Remove...
Datasets Changes New: C4 #2575 #2592 (@lhoestq) New: mC4 #2576 (@lhoestq) New: MasakhaNER #2465...
Bug fixes Fix filter indices when batched by @albertvillanova in https://github.com/huggingface/dat...
Datasets Changes New: Add Russian SuperGLUE #2668 (@slowwavesleep) New: Add Disfl-QA #2473 (@bha...