# ub-MOJI

An Open Dataset for Japanese Fingerspelling

A publicly available, temporally annotated video dataset of Japanese fingerspelling for sign language recognition and sequence modeling.

## Dataset Details

This section describes the structure of the ub-MOJI dataset, covering its three subsets, video and annotation formats, metadata files, and file naming conventions.

> **Note:** A portion of the samples is not publicly released due to participant consent.

### Subsets

Three linguistic units are provided, each organized differently on disk.

| Subset | Unit | Storage | Annotation |
|---|---|---|---|
| syllables | Single kana characters | Subdirectories by kana | No `.toml` |
| sequences | Five-kana sequences | Flat files | `.toml` for each sample |
| words | Full Japanese words | Flat files | `.toml` for each sample |
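Because the syllables subset is organized by subdirectory while the other two are flat, a loader has to branch on subset type. A minimal sketch with `pathlib`, assuming a local dataset root laid out as the table above describes (`iter_samples` and the label-extraction rule are illustrative, not part of the official tooling):

```python
from pathlib import Path

def iter_samples(root):
    """Yield (subset, label, video_path) for every sample.

    Assumes the layout described above: syllables/<kana>/*.mp4,
    plus flat *.mp4 files under sequences/ and words/.
    """
    root = Path(root)
    # syllables: one subdirectory per kana, videos inside
    for kana_dir in sorted((root / "syllables").iterdir()):
        if kana_dir.is_dir():
            for video in sorted(kana_dir.glob("*.mp4")):
                yield ("syllables", kana_dir.name, video)
    # sequences / words: flat files; the label is the first
    # underscore-separated token of the file name
    for subset in ("sequences", "words"):
        for video in sorted((root / subset).glob("*.mp4")):
            yield (subset, video.stem.split("_")[0], video)
```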
### Video & Annotations

Each sample is an RGB video; the sequences and words subsets additionally carry frame-level temporal annotations.

| Item | Format | Applies to | Notes |
|---|---|---|---|
| Video | `.mp4` (RGB) | All subsets | One file per sample |
| Annotations | `.toml` | sequences / words | Frame-level timing |
### Metadata Files

CSV files summarize sample-level and participant-level information.

| File | Scope | Key fields |
|---|---|---|
| `metadata.csv` | Per sample | `file_name`, `classes`, `category`, `participant_id`, `recording_date`, `fps` |
| `participants.csv` | Per participant | `participant_id`, `age_group`, `gender`, `dominant_hand`, `experience_years`, `hearing_level`, `face_visibility` |

Unspecified fields may be encoded as `-1`; treat these as missing values.
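When reading the CSVs, the `-1` placeholder is best mapped to a proper missing value so it is not mistaken for data. A sketch using the standard `csv` module; the example row is invented for illustration and does not come from the dataset:

```python
import csv
import io

def read_metadata(fp):
    """Read a metadata CSV, mapping the -1 placeholder to None."""
    rows = []
    for row in csv.DictReader(fp):
        rows.append({k: (None if v == "-1" else v)
                     for k, v in row.items()})
    return rows

# Invented example row; real metadata.csv uses the fields listed above.
example = io.StringIO(
    "file_name,classes,category,participant_id,recording_date,fps\n"
    "a_001_202403_t001.mp4,a,syllables,001,-1,30\n"
)
```

Here `read_metadata(example)` yields one row whose `recording_date` is `None` instead of the `-1` sentinel.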

### File Naming Convention

`{content}_{participantID}_{yyyymm}_{take}.mp4`

| Token | Meaning | Example |
|---|---|---|
| `content` | Kana / sequence / word | `a`, `aiueo`, `kamakura` |
| `participantID` | Participant ID | `001` |
| `yyyymm` | Recording year and month | `202403` |
| `take` | Take number | `t001` |
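The naming convention above can be parsed with a single regular expression. A minimal sketch whose group names mirror the tokens in the table (the fixed widths of `participantID`, `yyyymm`, and `take` are inferred from the examples):

```python
import re

# Pattern inferred from the examples: 3-digit participant ID,
# 6-digit yyyymm, and a take token of the form tNNN.
FILENAME_RE = re.compile(
    r"^(?P<content>.+)_(?P<participantID>\d{3})"
    r"_(?P<yyyymm>\d{6})_(?P<take>t\d{3})\.mp4$"
)

def parse_filename(name):
    """Split a sample file name into its naming-convention tokens."""
    m = FILENAME_RE.match(name)
    if m is None:
        raise ValueError(f"unexpected file name: {name}")
    return m.groupdict()
```

For example, `parse_filename("aiueo_001_202403_t001.mp4")` returns `{"content": "aiueo", "participantID": "001", "yyyymm": "202403", "take": "t001"}`.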

## License

Access is gated on Hugging Face. Agree to the terms, avoid privacy-invasive use, and cite the dataset in publications.

- Academic research only
- Non-commercial use
- No redistribution

## Authors

AI Vision Lab, Tokyo Polytechnic University

## Citation

Use the BibTeX below to cite the paper or dataset.

```bibtex
@InProceedings{Murai_2025_ICCV,
    author    = {Murai, Ryota and Tsuta, Naoto and Shin, Duk and Kang, Yousun},
    title     = {Point-Supervised Japanese Fingerspelling Localization via HR-Pro and Contrastive Learning},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV) Workshops},
    month     = {October},
    year      = {2025},
    pages     = {4975--4982},
    doi       = {10.1109/ICCVW69036.2025.00516},
}
```