Military Medical Sciences (军事医学) 2025, Vol. 49, Issue (1): 41-46, 6. DOI: 10.7644/j.issn.1674-9960.2025.01.007
Design and implementation of a mass spectral information database for toxicant and drug molecules
Design and establishment of a database for toxins and molecular mass spectra of drugs
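Li Xuemeng 1, Li Mengfan 1, Ma Junjie 1, Xu Bin 1, Du Jie 1, You Wei 1, Chen Jia 1, Xie Jianwei 1, Zhao Dongsheng 1
Author information
- 1. Academy of Military Medical Sciences, Academy of Military Sciences, Beijing 100850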
Abstract
Objective: To construct a database of molecular mass spectra of toxins and drugs that facilitates the management and retrieval of mass spectra for nerve agents, metabolites and other small molecules.
Methods: Requirement analysis and functional design were performed using software engineering methods. The Spec2Vec algorithm was used to produce vector representations of mass spectra, while molecular structures given as SMILES were vectorized using the extended connectivity fingerprint (ECFP). A data storage model integrating structured information and vector representations was established using the Milvus database. Similarity search of mass spectra and molecular structures was performed via vector similarity comparison and the FlashEntropySearch algorithm.
Results: The constructed database contains over 400,000 mass spectral entries from sources including OCAD, NIST, MassBank, metabolic products, and natural products of traditional Chinese medicine (TCM), and it supports similarity search over both mass spectra and molecular structures. On a standard server, a mass spectral similarity search took no more than 5 seconds, and a molecular structure similarity search took no more than 1 second.
Conclusion: The system enables efficient management of complex mass spectra and provides rapid retrieval and comparison of mass spectra-related information through vector indexing technology, offering data support and research tools for toxicology and pharmacology.
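The two kinds of similarity search named in the Methods can be illustrated with a small, self-contained sketch. It is not the authors' implementation and uses only the Python standard library. `entropy_similarity` computes the unweighted spectral entropy similarity, which, as far as we understand, is the score that FlashEntropySearch is designed to evaluate quickly; the index structure and the intensity weighting applied to low-entropy spectra are omitted here. `tanimoto` compares two fingerprints given as sets of on-bit indices, standing in for ECFP bit vectors that would normally come from a cheminformatics toolkit. `top_k` is a brute-force ranking that stands in for an indexed vector search such as the one Milvus provides. All function names, the peak lists, and the `tol` value are hypothetical.

```python
import math


def normalize(peaks):
    """Sort (m/z, intensity) peaks by m/z and scale intensities to sum to 1."""
    kept = [(mz, inten) for mz, inten in peaks if inten > 0]
    total = sum(inten for _, inten in kept)
    if total <= 0:
        raise ValueError("spectrum has no positive-intensity peaks")
    return sorted((mz, inten / total) for mz, inten in kept)


def shannon_entropy(probs):
    """Shannon entropy (natural log) of a list of probabilities."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def entropy_similarity(peaks_a, peaks_b, tol=0.01):
    """Unweighted entropy similarity: 1 - (2*S_AB - S_A - S_B) / ln(4).

    Assumes centroided spectra whose peaks are more than `tol` apart.
    """
    a, b = normalize(peaks_a), normalize(peaks_b)
    mixed = []  # intensities of the 1:1 mixture of the two spectra
    i = j = 0
    while i < len(a) and j < len(b):
        (mz_a, p_a), (mz_b, p_b) = a[i], b[j]
        if abs(mz_a - mz_b) <= tol:  # matched peaks: intensities add
            mixed.append((p_a + p_b) / 2)
            i, j = i + 1, j + 1
        elif mz_a < mz_b:
            mixed.append(p_a / 2)
            i += 1
        else:
            mixed.append(p_b / 2)
            j += 1
    mixed.extend(p / 2 for _, p in a[i:])
    mixed.extend(p / 2 for _, p in b[j:])
    s_a = shannon_entropy([p for _, p in a])
    s_b = shannon_entropy([p for _, p in b])
    s_ab = shannon_entropy(mixed)
    return 1.0 - (2.0 * s_ab - s_a - s_b) / math.log(4)


def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient of two fingerprints given as on-bit index sets."""
    a, b = set(bits_a), set(bits_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def top_k(query, library, score, k=3):
    """Brute-force ranking of library entries by score(query, entry)."""
    scored = [(score(query, item), name) for name, item in library.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [(name, s) for s, name in scored[:k]]


# Hypothetical reference spectra as (m/z, intensity) peak lists.
spectra = {
    "ref_1": [(50.0, 10.0), (77.0, 40.0), (105.0, 100.0)],
    "ref_2": [(50.0, 30.0), (91.0, 100.0)],
    "ref_3": [(60.0, 100.0), (120.0, 50.0)],
}
query = [(50.005, 12.0), (77.002, 35.0), (105.001, 100.0)]
hits = top_k(query, spectra, entropy_similarity, k=2)
for name, s in hits:
    print(f"{name}: {s:.3f}")
```

The same `top_k` helper ranks structures when called with `tanimoto` and a dictionary of fingerprint bit sets. A linear scan like this would be far too slow for a library of several hundred thousand entries, which is the case where a vector index or a dedicated search algorithm becomes useful.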
Keywords
Milvus vector database / mass spectra information / mass spectra similarity retrieval / molecular structure similarity search / FlashEntropySearch algorithm
Classification
Military science and technology
Cite this article
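Li Xuemeng, Li Mengfan, Ma Junjie, Xu Bin, Du Jie, You Wei, Chen Jia, Xie Jianwei, Zhao Dongsheng. Design and establishment of a database for toxins and molecular mass spectra of drugs [J]. Military Medical Sciences, 2025, 49(1): 41-46, 6.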