The National Institute for Materials Science (NIMS) has developed Research Data Express (RDE), a data management system designed to automate the processing of experimental data and create AI-ready datasets for materials research. Described in a paper published in Science and Technology of Advanced Materials: Methods, the system addresses a significant challenge in the field: research generates vast amounts of data, often in manufacturer-specific formats with inconsistent terminology, which makes aggregation, comparison, and reuse difficult.
Traditional materials research requires researchers to spend considerable time on tedious tasks such as format conversion, metadata assignment, and characteristics extraction. These extra steps frequently discourage data sharing, hindering the advancement of data-driven work. The problem has become increasingly acute as the field relies more heavily on AI-driven materials discovery, which demands high-quality, standardized datasets. RDE automatically interprets experimental data from raw files and manually entered measurements, then restructures and stores this information in a more readable format.
"RDE significantly reduces the burden of routine data processing for researchers and enhances data findability, interoperability, reusability (the FAIR principles), and traceability," explains Jun Fujima, corresponding author and researcher at NIMS's Materials Data Platform. "We hope this will promote collaborative, data-driven materials research." The system's core innovation is its "Dataset Template" approach, which defines and directs how data from different types of experiments should be processed, rather than simply defining data formats.
For example, when researchers upload spreadsheets of X-ray measurements from different sources, the Dataset Template can be configured to interpret them. The system then automatically performs advanced analyses and creates visualizations to provide immediate overviews. Multiple templates can be prepared for different materials research themes, allowing maximum flexibility in data management. Individual researchers can also easily prepare custom templates when necessary. Many templates have already been prepared and shared among users through the system.
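To make the idea concrete, the following Python sketch shows one way a template could describe how to process spreadsheets from different sources, rather than only describing a file format. It is an illustration under stated assumptions, not RDE's or RDEToolKit's actual API: the `DatasetTemplate` class, its `column_aliases` and `extractors` fields, the column headers, and the sample values are all hypothetical.

```python
import csv
import io
from dataclasses import dataclass, field


@dataclass
class DatasetTemplate:
    """Hypothetical template describing how one kind of measurement is processed."""
    name: str
    # Maps source-specific column headers to a shared vocabulary (assumed names).
    column_aliases: dict
    # Metadata extractors: metadata key -> function applied to the structured rows.
    extractors: dict = field(default_factory=dict)

    def process(self, raw_csv: str) -> dict:
        """Parse CSV text, rename known columns, and extract metadata."""
        rows = []
        for raw_row in csv.DictReader(io.StringIO(raw_csv)):
            row = {}
            for header, value in raw_row.items():
                key = self.column_aliases.get(header.strip())
                if key is not None:  # columns the template does not know are skipped
                    row[key] = float(value)
            rows.append(row)
        metadata = {key: fn(rows) for key, fn in self.extractors.items()}
        return {"template": self.name, "rows": rows, "metadata": metadata}


def peak_angle(rows):
    """Return the angle at which the measured intensity is highest."""
    return max(rows, key=lambda r: r["intensity"])["two_theta_deg"]


# One template covers two invented spreadsheet layouts for the same experiment type.
xrd_template = DatasetTemplate(
    name="xrd-example",
    column_aliases={
        "2Theta": "two_theta_deg",
        "Angle (deg)": "two_theta_deg",
        "Counts": "intensity",
        "Intensity (cps)": "intensity",
    },
    extractors={"peak_two_theta_deg": peak_angle, "n_points": len},
)

# Made-up sample data standing in for files from two different sources.
source_a = "2Theta,Counts\n20.0,110\n20.5,950\n21.0,130\n"
source_b = "Angle (deg),Intensity (cps)\n30.0,40\n30.5,45\n31.0,700\n"

result_a = xrd_template.process(source_a)
result_b = xrd_template.process(source_b)
print(result_a["metadata"])
print(result_b["metadata"])
```

In this sketch both files end up with the same column names and the same metadata keys, which is the property that makes datasets from different instruments comparable. A real template would also need to handle units, missing values, and instrument-specific headers.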
"RDE's unique approach allows researchers to freely define data structures tailored to their instruments, while enabling the system to perform massive data structuring and metadata extraction automatically," says Fujima. Since its launch in January 2023, RDE has demonstrated significant scalability with widespread adoption across Japan's materials research community. The system currently has over 5,000 users, with more than 1,900 Dataset Templates for various experimental methods implemented, over 16,000 datasets created, and more than three million data files accumulated.
RDE serves as a data infrastructure for major national initiatives, including the Materials Research DX Platform initiative promoted by Japan's Ministry of Education, Culture, Sports, Science and Technology. To encourage broader use within the research community, the NIMS team has released an open-source software toolkit called RDEToolKit. The research paper detailing the system is available at https://doi.org/10.1080/27660400.2025.2597702, and additional information about the journal can be found at https://www.tandfonline.com/STAM-M.
The development of RDE represents a significant advancement in materials science infrastructure, potentially accelerating discovery processes by reducing data processing burdens and facilitating collaboration. By creating standardized, AI-ready datasets, the system addresses a critical bottleneck in the field's transition toward data-driven research methodologies. This infrastructure development could have far-reaching implications for materials innovation across industries including electronics, energy, transportation, and healthcare, where new materials discoveries often drive technological breakthroughs.


