Generative AI in the Era of "Alternative Facts"
MIT Open Publishing Services
Large-scale foundation models, which are pre-trained on massive, unlabeled datasets and subsequently fine-tuned on specific tasks, have recently achieved unparalleled success on a wide array of applications, including in healthcare and biology. In this paper, we explore two foundation models recently developed for single-cell RNA sequencing data, scBERT and scGPT. Focusing on the fine-tuning task of cell type annotation, we explore the relative performance of pre-trained models compared to a simple baseline, L1-regularised logistic regression, including in the few-shot setting. We perform ablation studies to understand whether pre-training improves model performance and to better understand the difficulty of the pre-training task in scBERT. Finally, using scBERT as an example, we demonstrate the potential sensitivity of fine-tuning to hyper-parameter settings and parameter initialisations. Taken together, our results highlight the importance of rigorously testing foundation models against well-established baselines, establishing challenging fine-tuning tasks on which to benchmark foundation models, and performing deep introspection into the embeddings learned by the model in order to more effectively harness these models to transform single-cell data analysis. Code is available at https://github.com/clinicalml/sc-foundation-eval.
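The following is a minimal sketch of the kind of L1-regularised logistic regression baseline for cell type annotation that the abstract refers to; it is not the authors' exact pipeline (see the linked repository for that). The data, preprocessing, and hyper-parameters here are illustrative assumptions: `X` stands in for a cells-by-genes expression matrix and `y` for cell type labels.

```python
# Illustrative sketch of an L1-regularised logistic regression baseline for
# cell type annotation. Synthetic data stands in for real scRNA-seq counts.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report

rng = np.random.default_rng(0)
n_cells, n_genes, n_types = 500, 2000, 5
X = rng.poisson(1.0, size=(n_cells, n_genes)).astype(float)  # stand-in counts
y = rng.integers(0, n_types, size=n_cells)                    # stand-in labels

# Library-size normalise and log-transform, a common scRNA-seq preprocessing step.
X = np.log1p(X / X.sum(axis=1, keepdims=True) * 1e4)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)

# The L1 penalty induces sparsity over genes; C controls regularisation strength
# and would normally be chosen by cross-validation.
clf = LogisticRegression(penalty="l1", solver="saga", C=0.1, max_iter=2000)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))
```

Because the classifier is linear and sparse, it also serves as an interpretable point of comparison: the non-zero coefficients indicate which genes drive each cell type prediction.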