IBM Booth Demos & Staff @ NeurIPS 2024
IBM BOOTH SCHEDULE
Tue 12/10 (8 hourly slots):
1. 12:00 pm - 1:00 pm
2. 1:00 pm - 2:00 pm
3. 2:00 pm - 3:00 pm
4. 3:00 pm - 4:00 pm
5. 4:00 pm - 5:00 pm
6. 5:00 pm - 6:00 pm
7. 6:00 pm - 7:00 pm
8. 7:00 pm - 8:00 pm
IBM BOOTH DEMOS
Each entry below lists the demo title, the demo presenter, the demo abstract, and the IBM booth staff (researchers).
Demo Title: Automatically Deploying a Sequence-to-Sequence Transformer for Accelerated Discovery on the IBM HERMES Project Chip
Demo Presenter: Julian Büchel
Demo Abstract: Analog in-memory computing (AIMC) using resistive memory devices has the potential to increase the energy efficiency of deep neural network inference by multiple orders of magnitude. This is enabled by performing matrix-vector multiplications, one of the key operations in deep neural network inference, directly within the memory, avoiding expensive weight fetching from external memory such as DRAM. The IBM HERMES Project Chip is a state-of-the-art, 64-core mixed-signal AIMC chip based on Phase Change Memory that makes this concept a reality. Using this chip, we demonstrate automatic deployment and inference of a Transformer model capable of predicting the chemical compounds formed in a chemical reaction.
Booth Staff: Giacomo Camposampiero, Benjamin Hoover, Sivan Doveh, Kristjan Greenewald, Karthikeyan Natesan Ramamurthy, Vijay Ekambaram, Nandhini Chandramoorthy, Naigang Wang
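The in-memory matrix-vector multiply described above can be sketched in software. The following toy simulation is not HERMES-specific; the quantization depth and read-noise level are illustrative assumptions. It shows the two analog effects a deployed network must tolerate: weights quantized to discrete conductance levels, and noisy analog accumulation.

```python
import numpy as np

def analog_mvm(weights, x, n_levels=256, noise_std=0.02, rng=None):
    """Simulate a matrix-vector multiply on an analog in-memory crossbar.

    Weights are quantized to a finite number of conductance levels and
    Gaussian read noise is added to the accumulated outputs. Both the
    quantization depth and the noise level are illustrative assumptions,
    not HERMES Project Chip specifications.
    """
    rng = rng or np.random.default_rng(0)
    w_max = np.abs(weights).max()
    # Quantize weights to the available conductance levels.
    step = 2 * w_max / (n_levels - 1)
    w_q = np.round(weights / step) * step
    # Ideal analog accumulation, plus additive read noise on each output.
    y = w_q @ x
    return y + rng.normal(0.0, noise_std * w_max, size=y.shape)

rng = np.random.default_rng(42)
W = rng.standard_normal((16, 32))
x = rng.standard_normal(32)
y_analog = analog_mvm(W, x, rng=rng)
y_exact = W @ x
rel_err = np.linalg.norm(y_analog - y_exact) / np.linalg.norm(y_exact)
```

Because the multiply happens where the weights are stored, no weight traffic to DRAM is needed; the price is the small relative error computed above, which networks deployed on AIMC hardware must be robust to.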
Demo Title: EvalAssist - An LLM-as-a-Judge Framework
Demo Presenter: Werner Geyer
Demo Abstract: Evaluating large language model (LLM) outputs requires users to make critical judgments about the best outputs across various configurations, a process that is costly and time-consuming given the large amounts of data involved. LLMs are increasingly used as evaluators to filter training data, evaluate model performance, detect harms and risks, or assist human evaluators with detailed assessments, and effective front-end tools are critical for supporting this process. EvalAssist abstracts the LLM-as-a-judge evaluation process into a library of parameterizable evaluators (the criterion being the parameter), allowing the user to focus on criteria definition. EvalAssist consists of a web-based user experience, an API, and a Python toolkit, and is built on the UNITXT open-source library. The user interface gives users a convenient way to iteratively test and refine LLM-as-a-judge criteria, and it supports both direct (rubric-based) and pairwise assessment, the two most prevalent paradigms of LLM-as-a-judge evaluation. In our demo, we will showcase different types of evaluator LLMs for general-purpose evaluation, as well as the latest Granite Guardian model (released October 2024) for evaluating harms and risks.
Booth Staff: Giacomo Camposampiero, Sivan Doveh, Imran Nasim, Kristjan Greenewald, Dmitry Katz, Hendrik Strobelt
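The "criterion as the parameter" idea, and the two assessment paradigms the abstract names, can be sketched as plain prompt builders. This is a hypothetical illustration, not EvalAssist's actual API: the class and function names below are invented.

```python
from dataclasses import dataclass

@dataclass
class Criterion:
    """A judge criterion: the single parameter of a reusable evaluator."""
    name: str
    description: str
    options: tuple  # rubric options used for direct (rubric-based) assessment

def direct_prompt(criterion, context, response):
    """Build a direct (rubric-based) judge prompt; the options are the rubric."""
    opts = "\n".join(f"- {o}" for o in criterion.options)
    return (
        f"You are an evaluator. Criterion: {criterion.name}.\n"
        f"{criterion.description}\n"
        f"Task context:\n{context}\n"
        f"Response to evaluate:\n{response}\n"
        f"Choose exactly one option:\n{opts}"
    )

def pairwise_prompt(criterion, context, response_a, response_b):
    """Build a pairwise judge prompt comparing two responses on one criterion."""
    return (
        f"You are an evaluator. Criterion: {criterion.name}.\n"
        f"{criterion.description}\n"
        f"Task context:\n{context}\n"
        f"Response A:\n{response_a}\n"
        f"Response B:\n{response_b}\n"
        "Answer 'A' or 'B' for the response that better satisfies the criterion."
    )

conciseness = Criterion(
    name="conciseness",
    description="The response conveys the needed information without filler.",
    options=("excellent", "acceptable", "poor"),
)
p = direct_prompt(conciseness, "Summarize the ticket.", "Server was down; fixed.")
q = pairwise_prompt(conciseness, "Summarize the ticket.",
                    "Server was down; fixed.", "The server experienced downtime.")
```

Iterating on a criterion then means editing only the `Criterion` object and re-running the same evaluator over the data, which is the workflow the EvalAssist UI supports interactively.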
Demo Title: IBM Granite Vision Model for Enterprise AI
Demo Presenter: Leonid Karlinsky
Demo Abstract: Enterprise applications present unique challenges for vision and language foundation models, as they frequently involve visual data that diverges significantly from the typical distribution of web images and require understanding of nuanced details such as small text in scanned documents or tiny defects in industrial equipment images. Motivated by these challenges, we will showcase our IBM Granite Vision model, a foundation model with state-of-the-art performance in document image understanding tasks, such as the analysis of charts, plots, infographics, tables, flow diagrams, and more. We will provide a detailed overview of our methodology and present a live demonstration of our model's capabilities, illustrating its key features and applications. Our model will be open-sourced, allowing the community to access and contribute to its development.
Booth Staff: Giacomo Camposampiero, Sivan Doveh, Michael Katz, Hendrik Strobelt, Vadim Elisseev, Dave Braines
Demo Title: Low-latency, High energy-efficiency inference with IBM AIU NorthPole
Demo Presenter: David Cox
Demo Abstract: The IBM NorthPole AIU (artificial intelligence unit) is a unique hardware accelerator for AI inference, using a non-von Neumann architecture to achieve unprecedented inference throughput, low latency, and energy/cost efficiency. While NorthPole was originally designed for edge inference applications, its architecture makes it especially attractive for inference with decoder-only LLMs. Here we demonstrate the unique strengths of NorthPole using a customized version of IBM's Granite 3.0-series LLM deployed on a multi-NorthPole system, showing how it achieves high throughput with sub-millisecond latency. We also demonstrate how NorthPole's latency advantage can be exploited in an inference-scaling regime, where multiple chains of sequential inference calls achieve superior results on a task.
Booth Staff: Benjamin Hoover, Alexander Andreopoulos, Indra Priyadarsini, Yannis Belkhiter, Marvin Alberts, Djallel Bouneffouf, Dmitry Katz
Demo Title: Low-latency, High energy-efficiency inference with IBM AIU NorthPole
Demo Presenter: David Cox
Demo Abstract: Identical to the preceding NorthPole entry.
Booth Staff: Benjamin Hoover, Dennis Wei, Alexander Andreopoulos, Yannis Belkhiter, Debarun Bhattacharjya, Djallel Bouneffouf, Naigang Wang, Dave Braines
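The inference-scaling point in the NorthPole abstract can be made concrete with a back-of-the-envelope model. The latencies below are illustrative assumptions, not measured NorthPole figures; the sketch only shows why per-call latency, rather than throughput, bounds how many chains of sequential calls fit in a fixed time budget.

```python
def chains_within_budget(budget_ms, per_call_ms, calls_per_chain):
    """Number of complete sequential inference chains that fit in a
    wall-clock budget. The calls in a chain cannot be parallelized:
    each call depends on the previous call's output, so chain latency
    is per-call latency times chain length.
    """
    return budget_ms // (per_call_ms * calls_per_chain)

# Hypothetical numbers for illustration: a 20-step refinement chain,
# run within a 1-second budget, on a sub-millisecond-latency accelerator
# versus a throughput-oriented system with 50 ms per call.
fast_chains = chains_within_budget(1000, 1, 20)   # low-latency accelerator
slow_chains = chains_within_budget(1000, 50, 20)  # throughput-oriented system
```

Under these assumed numbers, the low-latency system completes 50 full chains in the time the other completes one, which is why a latency advantage translates directly into more inference-scaling attempts per unit time.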
Demo Title: ProLLM: Program analysis driven, LLM assisted application modernization
Demo Presenter: Atin Sood
Demo Abstract: Large language models for code (code LLMs) are rapidly gaining popularity and capability, enabling a wide array of application modernization use cases such as code explanation, test generation, code repair, refactoring, translation, code generation, code completion, and more. To leverage code LLMs to their full potential, developers must provide code-specific contextual information to the models. We will demonstrate generic pipelines we built that incorporate static analysis to guide LLMs in generating code explanations at various levels (application, method, class) and automated test generation that produces compilable, high-coverage, natural-looking test cases. We will also demonstrate how these pipelines can be built using "codellm-devkit", an open-source library that significantly simplifies program analysis at various levels of granularity, making it easier to integrate the detailed, code-specific insights that enhance the operational efficiency and effectiveness of LLMs in coding tasks, and how these use cases extend to different programming languages, specifically Java and Python.
Booth Staff: Trey Tinnell, Mikhail Yurochkin, Kristjan Greenewald, Tomoya Sakai, Djallel Bouneffouf, Luis Lastras, Enara Vijil, Indra Priyadarsini
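As a rough illustration of how static analysis supplies code-specific context to an LLM, the sketch below uses Python's standard `ast` module rather than codellm-devkit (whose API is not reproduced here). It extracts one function's source and callees so a prompt can carry exactly the relevant context instead of a whole file.

```python
import ast
import textwrap

# A tiny stand-in codebase for the example.
SOURCE = textwrap.dedent('''
    def area(r):
        """Area of a circle."""
        return 3.14159 * r * r

    def ring(outer, inner):
        return area(outer) - area(inner)
''')

def function_context(source, name):
    """Collect a function's source text and the names of functions it calls.

    This is the kind of code-specific context static analysis can hand to
    an LLM for explanation or test generation, scoped to one method rather
    than the whole application.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.FunctionDef) and node.name == name:
            calls = sorted({
                n.func.id
                for n in ast.walk(node)
                if isinstance(n, ast.Call) and isinstance(n.func, ast.Name)
            })
            return {"source": ast.get_source_segment(source, node),
                    "calls": calls}
    return None

ctx = function_context(SOURCE, "ring")
prompt = (f"Explain this function:\n{ctx['source']}\n"
          f"It calls: {', '.join(ctx['calls'])}")
```

A real pipeline would additionally resolve callees across files and languages (the abstract mentions Java and Python), but the contract is the same: analysis narrows the context, the LLM does the generation.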
Demo Title: IBM AI Agent SWE-1.0 (with open LLMs)
Demo Presenter: Martin Hirzel
Demo Abstract: Resolving issues from an issue tracker on a source-code repository is tedious and expensive when done by hand. Recently, the SWE-bench Lite leaderboard has seen submissions by several LLM-based agents that do this automatically. Unfortunately, these agents rely on closed-source frontier models, making them expensive and raising data-sharing concerns for industrial use. In contrast, we built IBM AI Agent SWE-1.0, which works with a variety of open-source models such as Llama, Granite, and Mistral. SWE-1.0 uses sub-agents that are specialized for the sub-tasks of localization, editing, and testing. Each sub-task is within reach of the capabilities of an open-source model. Furthermore, SWE-1.0 uses automated checking and repair of various common mistakes made by models, uses structured formats for data passed between sub-agents, and uses ensembling at multiple levels. Overall, SWE-1.0 has issue resolution rates approaching those of closed-source frontier models, but with open-source models.
Booth Staff: Karthikeyan Natesan Ramamurthy, Radu Marinescu, Bo Wen, Tomoya Sakai, Nima Dehmamy, Dmitry Katz, Vadim Elisseev, Emilio Vital Brazil
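The sub-agent decomposition in the SWE-1.0 abstract can be sketched as a pipeline of narrow stages with structured hand-offs and an ensemble vote at the end. The agents below are stubs with invented heuristics; in the real system each stage would call an open LLM.

```python
from collections import Counter

def localize(issue, files):
    """Localization sub-agent stub: pick the file the issue text mentions."""
    return next((f for f in files
                 if f.split("/")[-1].split(".")[0] in issue), None)

def edit(file, issue, seed):
    """Editing sub-agent stub: produce one candidate patch (varies by seed),
    passed downstream as a structured record rather than free text."""
    return {"file": file, "patch": f"fix-{seed % 2}"}

def run_tests(candidate):
    """Testing sub-agent stub: accept only patches that pass the checks."""
    return candidate["patch"] == "fix-0"

def resolve(issue, files, n_samples=5):
    """Localize once, sample several edits, filter by tests, then ensemble:
    majority vote over the patches that survive the testing stage."""
    target = localize(issue, files)
    candidates = [edit(target, issue, s) for s in range(n_samples)]
    passing = [c["patch"] for c in candidates if run_tests(c)]
    return Counter(passing).most_common(1)[0][0] if passing else None

result = resolve("crash in parser when input is empty",
                 ["src/parser.py", "src/cli.py"])
```

Keeping each stage this narrow is what puts the sub-tasks "within reach" of open models: no single call has to localize, edit, and verify at once, and the structured records between stages are easy to check and repair automatically.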
Demo Title: Site Reliability Engineering Agent-101 for Incident Management
Demo Presenter: Rohan Arora
Demo Abstract: IT failures are increasingly costly; as more business moves online, even brief outages can lead to millions in losses. Incident management has become more complex than ever due to a combination of technological advancement, infrastructure heterogeneity, and evolving business needs. Resolving IT incidents is as tedious and expensive as fixing software bugs, if not more so. Several advances have been made, including IBM's Intelligent Incident Remediation, which uses LLMs and generative AI to streamline incident resolution by identifying probable causes and offering AI-guided remediation steps. In this demo, we describe how we are advancing the state of the art in incident remediation using agentic generative-AI approaches. We demonstrate SRE-Agent-101, a ReAct-style LLM-based agent, along with a benchmark to standardize the evaluation of analytical solutions for incident management. SRE-Agent-101 uses several custom-built tools, namely anomaly detection, causal topology extraction, NL2Traces, NL2Metrics, NL2Logs, NL2TopologyTraversal, and NL2Kubectl. These tools take natural language as input and fetch target data gathered by the observability stack. Given the verbosity of such data, even powerful models can quickly exhaust their context length, so we implemented a methodology to dynamically discover a more specific context using domain knowledge. The target context is then analyzed by the underlying LLM to infer the root-cause entity and fault and to perform actions; this process continues iteratively until the incident is resolved.
Booth Staff: Dennis Wei, Karthikeyan Natesan Ramamurthy, Bo Wen, Nima Dehmamy, Lisa Hamada, Vadim Elisseev
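A ReAct-style loop of the kind the SRE-Agent-101 abstract describes can be sketched with stubbed tools and a stand-in policy. The tool names mirror those in the abstract, but their outputs and the decision rules below are invented for illustration; in the real agent the policy is an LLM and the tools query a live observability stack.

```python
def nl2logs(query):
    """NL2Logs stub: natural-language query in, log lines out."""
    return ["payment-svc: OOMKilled", "payment-svc: restart x3"]

def nl2metrics(query):
    """NL2Metrics stub: natural-language query in, metric values out."""
    return {"payment-svc.memory_mb": 2048, "limit_mb": 2048}

TOOLS = {"NL2Logs": nl2logs, "NL2Metrics": nl2metrics}

def model_step(scratchpad):
    """Stand-in for the LLM policy: reason over the scratchpad so far and
    emit either the next tool call or a final answer."""
    if "OOMKilled" not in str(scratchpad):
        return ("NL2Logs", "recent errors for payment-svc")
    if "memory_mb" not in str(scratchpad):
        return ("NL2Metrics", "memory usage for payment-svc")
    return ("FINAL", "payment-svc hit its memory limit; raise the limit or fix the leak")

def react_loop(max_steps=5):
    """Reason -> act -> observe, accumulating observations until the policy
    concludes or the step budget runs out."""
    scratchpad = []
    for _ in range(max_steps):
        action, arg = model_step(scratchpad)
        if action == "FINAL":
            return arg, scratchpad
        observation = TOOLS[action](arg)  # act, then observe
        scratchpad.append((action, observation))
    return None, scratchpad

answer, trace = react_loop()
```

The scratchpad here plays the role of the dynamically discovered context the abstract mentions: only the tool observations the policy actually requested reach the model, rather than the full verbose output of the observability stack.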