{"id":168450,"date":"2021-06-09T15:15:30","date_gmt":"2021-06-09T10:15:30","guid":{"rendered":"https:\/\/venturebeat.com\/?p=2694904"},"modified":"2021-06-09T15:15:30","modified_gmt":"2021-06-09T10:15:30","slug":"facebook-proposes-nethack-as-a-grand-challenge-in-ai-research","status":"publish","type":"post","link":"https:\/\/www.technologyforyou.org\/facebook-proposes-nethack-as-a-grand-challenge-in-ai-research\/","title":{"rendered":"Facebook proposes NetHack as a grand challenge in AI research"},"content":{"rendered":"<p>Facebook today proposed NetHack as a grand challenge for AI research, for which the company is launching a competition at the NeurIPS 2021 AI conference in Sydney, Australia. Facebook asserts that NetHack, an \u201980s video game with simple visuals that\u2019s considered among the hardest in the world, can enable data scientists to benchmark state-of-the-art AI methods in a complex environment without the need to run experiments on a powerful computer.<\/p>\n<p>Games have served as benchmarks for AI for decades, but things really kicked into gear in 2013, the year Google\u2019s DeepMind demonstrated a system that could play Pong, Breakout, Space Invaders, Seaquest, Beamrider, Enduro, and Q*bert, in some cases at superhuman levels. The advancements aren\u2019t merely improving game design, according to experts like DeepMind cofounder Demis Hassabis. 
Rather, they\u2019re informing the development of systems that might one day diagnose illnesses, predict complicated <a href=\"https:\/\/venturebeat.com\/2018\/12\/03\/deepminds-alphafold-wins-casp13-protein-folding-competition\/\">protein structures<\/a>, and&nbsp;<a href=\"https:\/\/venturebeat.com\/2018\/09\/13\/googles-deepmind-ai-gains-on-human-oncologists-in-planning-radiation-cancer-treatments\/\">segment CT scans<\/a>.<\/p>\n<p>In particular, <a href=\"https:\/\/venturebeat.com\/2021\/02\/23\/how-reinforcement-learning-chooses-the-ads-you-see\/\">reinforcement learning<\/a> \u2014 a type of AI that can learn strategies to orchestrate large systems like manufacturing plants, traffic control systems, financial portfolios, and robots \u2014 is transitioning from research labs to highly impactful, real-world applications. For example, self-driving car companies like <a href=\"https:\/\/wayve.ai\/blog\/learning-to-drive-in-a-day-with-reinforcement-learning\/\" target=\"_blank\" rel=\"noopener\">Wayve<\/a>&nbsp;and&nbsp;<a href=\"https:\/\/heartbeat.fritz.ai\/how-googles-self-driving-cars-work-c77e4126f6e7\" target=\"_blank\" rel=\"noopener\">Waymo<\/a> are using reinforcement learning to develop the control systems for their cars. 
And via Microsoft\u2019s <a href=\"https:\/\/venturebeat.com\/2020\/05\/19\/microsoft-launches-project-bonsai-an-ai-development-platform-for-industrial-systems\/\">Bonsai<\/a>, Siemens is employing reinforcement learning to calibrate its CNC machines.<\/p>\n<p>\u201cRecent advances in reinforcement learning have been fueled by simulation environments such as games like StarCraft II, Dota 2, or Minecraft. However, this progress came at substantial computational costs, often requiring running thousands of GPUs in parallel for a single experiment, while also falling short of leading to \u2026 methods that can be transferred to more real-world problems outside of these games,\u201d Facebook AI researchers Edward Grefenstette, Tim Rockt\u00e4schel, and Eric Hambro wrote in a blog post. \u201cWe need environments that are complex, highlighting shortcomings of RL, while also allowing extremely fast simulation at low computation costs.\u201d<\/p>\n<h2>NetHack<\/h2>\n<p>Facebook\u2019s proposal follows the release of the company\u2019s <a href=\"https:\/\/venturebeat.com\/2020\/06\/25\/facebook-releases-ai-development-tool-based-on-nethack\/\">NetHack Learning Environment (NHLE)<\/a>, a research tool based on the original NetHack. (The NetHack Challenge is in turn based on the NHLE.) NetHack, which was first released in 1987, tasks players with descending more than 50 dungeon levels to retrieve a magical amulet, during which they must use wands, weapons, armors, potions, spellbooks, and other items and fight monsters. Levels in NetHack are procedurally generated and every game is different, which the Facebook researchers note tests the generalization limits of leading AI.<\/p>\n<p>\u201cWinning a game of NetHack requires long term planning in an incredibly unforgiving environment. Once a player\u2019s character dies \u2026 the game starts from scratch in an entirely new dungeon,\u201d Grefenstette, Rockt\u00e4schel, and Hambro continued. 
\u201cSuccessfully completing the game as an expert player takes on average 25 to 50 times more steps than an average StarCraft II game, and players\u2019 interactions with objects and the environment are extremely complex, so success often hinges on calling upon imagination to solve problems in creative or surprising ways as well as consulting external knowledge sources [such as] the official <a class=\"_8xc5 _8y8i _8x97 _8w61\" href=\"http:\/\/www.nethack.org\/download\/3.6.5\/nethack-365-Guidebook.pdf?fbclid=IwAR206hmcBK5jQK5xs62ofrzUJU9UuHtH-bFvL1mu1ptdB7HHS87rrUI_xCM\" target=\"_blank\" rel=\"noopener nofollow noreferrer\">NetHack Guidebook<\/a>, the&nbsp;<a class=\"_8xc5 _8y8i _8x97 _8w61\" href=\"https:\/\/nethackwiki.com\/?fbclid=IwAR2nt4Z6J_lVCO9SV3Je4LBoVzm0xrz2X7uQOFbdKLL-nb6QTnKETeEPgoI\" target=\"_blank\" rel=\"noopener nofollow noreferrer\">NetHack Wiki<\/a>, and online videos and forum 
discussions].\u201d<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"size-large wp-image-2614874 aligncenter\" src=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2020\/06\/example_run.gif?w=770&amp;resize=770%2C434&amp;strip=all\" alt=\"Facebook NetHack Learning Environment\" width=\"770\" height=\"434\" data-recalc-dims=\"1\"><\/p>\n<p>Partial observation makes exploration in NetHack essential, and procedural generation and \u201cpermadeath\u201d make the cost of failure significant. An AI agent also can\u2019t reset or interfere with the environment, which rules out the methods that underpin systems like DeepMind\u2019s <a href=\"https:\/\/venturebeat.com\/2018\/12\/06\/google-deepmind-alphazero-chess-shogi-go\/\">AlphaZero<\/a> for chess, shogi, and Go or Uber\u2019s <a href=\"https:\/\/venturebeat.com\/2018\/11\/26\/uber-ai-reliably-completes-all-stages-in-montezumas-revenge\/\">Go-Explore<\/a> for Montezuma\u2019s Revenge.<\/p>\n<p>\u201c[The challenges in NetHack] range from randomized mazes to more structured challenges, like large rooms full of monsters and traps, towns and forts, and hazards such as kraken-infested waters,\u201d Grefenstette, Rockt\u00e4schel, and Hambro said. \u201cNew ways of dealing with the ever changing observations in a stochastic and rich game world calls for the development of techniques that have a better chance of scaling to real-world settings with high degrees of variability.\u201d<\/p>\n<h2>Lightweight<\/h2>\n<p>NetHack has another advantage in its lightweight architecture. A turn-based, ASCII-art world and a game engine written primarily in C capture its complexity. NetHack forgoes all but the simplest physics and renders symbols instead of pixels, which lets AI learn quickly without wasting computational resources on simulating dynamics or rendering observations.<\/p>\n<p>Indeed, training&nbsp;sophisticated machine learning models in the cloud remains prohibitively expensive. 
According to a&nbsp;<a href=\"https:\/\/medium.com\/syncedreview\/the-staggering-cost-of-training-sota-ai-models-e329e80fa82\">recent Synced report<\/a>, the University of Washington\u2019s Grover, which is tailored for both the generation and detection of fake news, cost $25,000 to train over the course of two weeks. OpenAI racked up $256 per hour to train its <a href=\"https:\/\/venturebeat.com\/2019\/08\/20\/openai-releases-curtailed-version-of-gpt-2-language-model\/\">GPT-2<\/a>&nbsp;language model, and Google spent an estimated $6,912 training&nbsp;<a href=\"https:\/\/venturebeat.com\/2018\/11\/02\/google-open-sources-bert-a-state-of-the-art-training-technique-for-natural-language-processing\/\">BERT<\/a>, a bidirectional transformer model that redefined the state of the art for 11 natural language processing tasks.<\/p>\n<p class=\"_8w6f _8w61 _8w6h\">By contrast, a single high-end graphics card is sufficient to train AI-driven NetHack agents hundreds of millions of steps a day using the TorchBeast framework, which supports further scaling by adding more graphics cards or machines. Agents can experience billions of steps in the environment in a reasonable time frame while still challenging the limits of what current techniques can achieve.<\/p>\n<p><img decoding=\"async\" loading=\"lazy\" class=\"size-large wp-image-2614873 aligncenter\" src=\"https:\/\/venturebeat.com\/wp-content\/uploads\/2020\/06\/103413844_278712716658214_8988993104994680911_n.png?w=800&amp;resize=800%2C509&amp;strip=all\" alt=\"Facebook NetHack Learning Environment\" width=\"800\" height=\"509\" data-recalc-dims=\"1\"><\/p>\n<p>\u201c[The NHLE] can train reinforcement learning agents \u202615 times faster than even decade-old Atari benchmark[s]. 
Furthermore, NetHack can be used to test the limits of even more recent state-of-the-art deep reinforcement learning methods while running 50 to 100 times faster than challenges of comparable difficulty while providing a higher degree of complexity.\u201d<\/p>\n<h2>Challenge<\/h2>\n<p>The NHLE consists of three components: a Python interface to NetHack using the popular OpenAI Gym API, a suite of benchmark tasks, and a baseline machine learning agent. To beat the NetHack Challenge, entrants must develop AI that can reliably either win at NetHack or achieve as high a score as possible. The competition aims to yield a head-to-head comparison of different methods and new benchmarks for future research, while at the same time showcasing the suitability of the NHLE as a setting for research.<\/p>\n<p>There won\u2019t be restrictions on how the systems can be trained for the NetHack Challenge, Facebook says; participants are welcome to use techniques besides machine learning if they choose. Awards will be given for (1) the best overall AI system, (2) the best AI system not using a <a href=\"https:\/\/venturebeat.com\/2021\/05\/25\/the-business-value-of-neural-networks\/\">neural network<\/a>, and (3) the best AI system from an academic or independent team.<\/p>\n<p>Grefenstette, Rockt\u00e4schel, and Hambro say that achieving these objectives will lay the groundwork for follow-up competitions focused on specific aspects of AI. Moreover, the NetHack Challenge might help bring to light classes of training methods and modeling approaches capable of dealing with highly varied environments and a high cost of errors, like having to restart from scratch if a character is killed by a creature.<\/p>\n<p>\u201cMany real-world and industrial problems \u2014 navigation, for example \u2014 share these characteristics. 
Consequently, making progress in NetHack is making progress toward reinforcement learning in a wider range of applications,\u201d Grefenstette, Rockt\u00e4schel, and Hambro said.<\/p>\n<p>Facebook\u2019s NeurIPS 2021 NetHack Challenge will be conducted in partnership with co-organizer <a href=\"https:\/\/www.aicrowd.com\/\">AIcrowd<\/a>, and it\u2019ll run from early June through October. The winners will be announced at NeurIPS in December.<\/p>\n<p><a href=\"http:\/\/feedproxy.google.com\/~r\/venturebeat\/SZYF\/~3\/IpKqPzG9TBs\/\">Source Link<\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>
Facebook today proposed NetHack as a grand challenge for AI research, for which the company is launching a competition at the NeurIPS 2021 AI conference in Sydney, Australia. It\u2019s Facebook\u2019s assertion that NetHack, an \u201980s video game with simple visuals that\u2019s considered among the hardest [&hellip;]<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[27765,27766,14083],"tags":[20,37,73,16507,16830,16413,277,76,32202,16420,16676,22830,15107],"class_list":{"0":"post-168450","1":"post","2":"type-post","3":"status-publish","4":"format-standard","6":"category-artificial-intelligence-news","7":"category-machine-learning-news","8":"category-technology-industry-news","9":"tag-ai","10":"tag-artificial-intelligence","11":"tag-big-data","12":"tag-category-games","13":"tag-category-online-communities-virtual-worlds","14":"tag-dev","15":"tag-facebook","16":"tag-machine-learning","17":"tag-nethack","18":"tag-pc-gaming","19":"tag-reinforcement-learning","20":"tag-vb-home-page","21":"tag-video-games"},"_links":{"self":[{"href":"https:\/\/www.technologyforyou.org\/wp-json\/wp\/v2\/posts\/168450","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.technologyforyou.org\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.technologyforyou.org\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.technologyforyou.org\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.technologyforyou.org\/wp-json\/wp\/v2\/comments?post=168450"}],"version-history":[{"count":0,"href":"https:\/\/www.technologyforyou.org\/wp-json\/wp\/v2\/posts\/168450\/revisions"}],"wp:attachment":[{"href":"https:\/\/www.technologyforyou.org\/wp-json\/wp\/v2\/media?parent=168450"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.technologyforyou.org\/wp-jso
n\/wp\/v2\/categories?post=168450"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.technologyforyou.org\/wp-json\/wp\/v2\/tags?post=168450"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}