Multimodal AI
2. Be Like Clippy (be-clippy.com) 💡 The content focuses on a social movement advocating transparent, user-friendly practices at AI-related tech companies and addressing data exploitation, which falls under the societal impact of AI.
3. Show HN: OCR Arena – A playground for OCR models (ocrarena.ai) 💡 The story focuses on OCR Arena, a playground for exploring and comparing OCR model capabilities, which aligns with the Models category's scope of model comparisons and practical capability demonstrations.
8. AI Mafia Network – An interactive visualization (dipakwani.com) 💡 The story focuses on an interactive visualization of the AI Mafia Network, a social exploration of connections within the AI community, fitting the 'society' category under social aspects of the ecosystem.
10. How AI hears accents: An audible visualization of accent clusters (accent-explorer.boldvoice.com) 💡 The article details BoldVoice's accent identifier model, including its architecture (fine-tuned HuBERT), training process, and ability to cluster accents in latent space, which aligns with the 'models' category (model details and capabilities).
12. Introduction to Multi-Armed Bandits (2019) (arxiv.org) 💡 This is an academic arXiv paper on multi-armed bandits, a fundamental machine learning framework for decision-making under uncertainty, which falls into the research category.
18. MCP Gateway and Registry (github.com) 💡 The story involves IBM's MCP Gateway and Registry, and MCP (Model Context Protocol) is explicitly listed under the agents category as part of agentic systems and tool use.
19. How can AI ID a cat? (quantamagazine.org) 💡 The story explains the underlying algorithms and research behind how AI identifies cats, focusing on computer vision and image classification principles, which aligns with the 'research' category.
23. FastVLM: Efficient Vision Encoding for Vision Language Models (machinelearning.apple.com) 💡 The story presents FastVLM, an efficient vision encoding method for vision language models, hosted on Apple's machine learning research page, aligning with the research category for AI algorithms and technical advancements.
24. Multiplatform Matrix Multiplication Kernels (burn.dev) 💡 The story focuses on multiplatform matrix multiplication kernels, which are fundamental components enabling efficient AI model inference and training, aligning with the 'infra' category under Engineering.
27. Show HN: Cactus – Ollama for Smartphones (github.com) 💡 The story introduces Cactus, a cross-platform framework for deploying LLMs, VLMs, embedding models, and TTS locally on smartphones. This aligns with the 'infra' category, which focuses on deployment, inference, and edge-device AI.
Page 1 of 4, 104 items in total