{"id":301936,"date":"2025-04-30T05:10:00","date_gmt":"2025-04-30T12:10:00","guid":{"rendered":"https:\/\/sftarticles.wpenginepowered.com\/es\/?p=354005"},"modified":"2025-07-01T14:45:59","modified_gmt":"2025-07-01T21:45:59","slug":"perfection-is-the-enemy-of-ai","status":"publish","type":"post","link":"https:\/\/cms-articles.softonic.io\/en\/perfection-is-the-enemy-of-ai\/","title":{"rendered":"Perfection is the enemy of AI"},"content":{"rendered":"\n<p>A research team from the University of Michigan has developed a new collective communication system called OptiReduce, which accelerates artificial intelligence (AI) training and machine learning across multiple cloud servers.<\/p>\n\n\n<p>This innovative system sets time limits for communication between servers, eliminating the need to wait for everyone to complete their tasks, which translates into greater efficiency in processing large models.<\/p>\n\n\n<p><strong>Distributed deep learning requires multiple servers to work together,<\/strong> but congestion and delays are common in cloud computing centers due to the simultaneous load of jobs.<\/p>\n\n\n<figure class=\"wp-block-embed is-type-rich is-provider-twitter wp-block-embed-twitter\"><div class=\"wp-block-embed__wrapper\">\n<blockquote class=\"twitter-tweet\" data-width=\"550\" data-dnt=\"true\"><p lang=\"en\" dir=\"ltr\">Thrilled to share that our work, OptiReduce: A Collective Communication System for the Cloud&#8212;in collaboration with <a href=\"https:\/\/twitter.com\/nvidia?ref_src=twsrc%5Etfw\">@nvidia<\/a> and <a href=\"https:\/\/twitter.com\/Broadcom?ref_src=twsrc%5Etfw\">@broadcom<\/a>&#8212;has been accepted to <a href=\"https:\/\/twitter.com\/hashtag\/NSDI?src=hash&amp;ref_src=twsrc%5Etfw\">#NSDI<\/a> &#39;25! ? Huge kudos to my student, Ertza Warraich, for his persistence and resilience in making this happen!! ?? 
<a href=\"https:\/\/t.co\/aZSINz838E\">pic.twitter.com\/aZSINz838E<\/a><\/p>&mdash; Muhammad Shahbaz (@msbaz2013) <a href=\"https:\/\/twitter.com\/msbaz2013\/status\/1867105187568959951?ref_src=twsrc%5Etfw\">December 12, 2024<\/a><\/blockquote><script async src=\"https:\/\/platform.twitter.com\/widgets.js\" charset=\"utf-8\"><\/script>\n<\/div><\/figure>\n\n\n<h2 class=\"wp-block-heading\">AI models thrive with the OptiReduce communication method<\/h2>\n\n\n<p><strong>OptiReduce offers a solution by introducing time limits that allow the process to progress without waiting for the slower servers to catch up<\/strong>. As a result, it reaches the target accuracy 70% faster than Gloo and 30% faster than NCCL in shared cloud environments.<\/p>\n\n\n<p>Because the time limits cause some data to be lost, OptiReduce uses mathematical techniques to approximate the missing information, minimizing the impact on the model's final accuracy.<\/p>\n\n\n<figure class=\"wp-block-embed is-type-video is-provider-youtube wp-block-embed-youtube wp-embed-aspect-16-9 wp-has-aspect-ratio\"><div class=\"wp-block-embed__wrapper\">\n<iframe loading=\"lazy\" title=\"&quot;Godfather of AI&quot; has new warning about artificial intelligence: &quot;You should worry&quot;\" width=\"840\" height=\"473\" src=\"https:\/\/www.youtube.com\/embed\/sgSj9mBr0w4?feature=oembed\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" referrerpolicy=\"strict-origin-when-cross-origin\" allowfullscreen><\/iframe>\n<\/div><\/figure>\n\n\n<p>The researchers argue that by accepting &#8220;limited reliability,&#8221; machine learning jobs can run faster without compromising their accuracy.<\/p>\n\n\n<p>In its tests, OptiReduce proved significantly more effective than existing communication systems, allowing large AI models, such as Llama 4 and Gemini, to be more resilient 
to data loss.<\/p>\n\n\n<p>The team is also exploring the possibility of moving towards hardware-level solutions to achieve communication speeds of hundreds of Gigabits per second, a step that could further revolutionize cloud processing capabilities.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>A research team from the University of Michigan has developed a new collective communication system called OptiReduce, which accelerates the training of artificial intelligence (AI) and machine learning across multiple cloud servers. This innovative system sets time limits for communication between servers, eliminating the need to wait for everyone to complete their tasks, resulting in greater efficiency in processing large models. Distributed deep learning requires several servers to work together, but congestion and delays are common in cloud computing centers due to [&#8230;]<\/p>\n","protected":false},"author":9317,"featured_media":301945,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":"","wpcf-pageviews":0},"categories":[1015],"tags":[3885],"usertag":[],"vertical":[],"content-category":[6771],"class_list":["post-301936","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-news","tag-inteligencia-artificial","content-category-ai"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/cms-articles.softonic.io\/en\/wp-json\/wp\/v2\/posts\/301936","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/cms-articles.softonic.io\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/cms-articles.softonic.io\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/cms-articles.softonic.io\/en\/wp-json\/wp\/v2\/users\/9317"}],"replies":[{"embeddable":true,"href":"https:\/\/cms-articles.softonic.io\/en\/wp-json\/wp\/v2\/comments?post=301936"}],"version-history":[{"count":1,"href":"https:\/\/cms-articles.softonic.io\/e
n\/wp-json\/wp\/v2\/posts\/301936\/revisions"}],"predecessor-version":[{"id":307426,"href":"https:\/\/cms-articles.softonic.io\/en\/wp-json\/wp\/v2\/posts\/301936\/revisions\/307426"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/cms-articles.softonic.io\/en\/wp-json\/wp\/v2\/media\/301945"}],"wp:attachment":[{"href":"https:\/\/cms-articles.softonic.io\/en\/wp-json\/wp\/v2\/media?parent=301936"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/cms-articles.softonic.io\/en\/wp-json\/wp\/v2\/categories?post=301936"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/cms-articles.softonic.io\/en\/wp-json\/wp\/v2\/tags?post=301936"},{"taxonomy":"usertag","embeddable":true,"href":"https:\/\/cms-articles.softonic.io\/en\/wp-json\/wp\/v2\/usertag?post=301936"},{"taxonomy":"vertical","embeddable":true,"href":"https:\/\/cms-articles.softonic.io\/en\/wp-json\/wp\/v2\/vertical?post=301936"},{"taxonomy":"content-category","embeddable":true,"href":"https:\/\/cms-articles.softonic.io\/en\/wp-json\/wp\/v2\/content-category?post=301936"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}