VisionFM

2025/2/23 0:41:51  Source: https://blog.csdn.net/weixin_37707670/article/details/141159451  Keywords: VisionFM

VisionFM: a Multi-Modal Multi-Task Vision Foundation Model for Generalist Ophthalmic Artificial Intelligence

    • Reading impressions:

Recently, AI foundation models (FMs), such as GPT-4 [17] and SAM [18], have emerged and have the potential to transform many research and industrial domains [19, 20]. FMs are models trained on a broad range of data that can later be adapted to solve a wide (rather than narrow) spectrum of tasks with their generalist intelligence, providing new opportunities to tackle the growing global ophthalmic challenges with a much more efficient, adaptable and scalable solution [21].

Albeit impressive, RETFound is still limited in the number of ophthalmic modalities it can process, i.e., only fundus photography and optical coherence tomography (OCT), and in the spectrum of clinical tasks at which it excels, i.e., mainly ocular disease diagnosis and prognosis, as well as prediction of systemic diseases. In diagnosing diseases, RETFound still relies on modality-specific classifiers, which is inefficient when generalizing to a broader range of ophthalmic image modalities.

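Reading impressions:

Unlike earlier methods, this model's distinguishing feature is its focus on multi-modality and multi-tasking: it is a single model that can handle multiple modalities and multiple tasks.
After reading the paper, however, is its accuracy the highest from a technical standpoint? Evidently not, because the methods it is compared against are fairly dated, UNet for example. The paper's contribution therefore lies in its use of large-scale data and a new approach to model design, rather than in any particularly novel technique.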
