GLM-4 is a large language model with billions of parameters, capable of processing vast amounts of natural language text and thereby generating and understanding language more accurately and naturally.
Compared with the previous generation, GLM-4 shows significant improvements on various benchmarks such as MMLU, GSM8K, and MATH. It also supports a context length of 128k tokens and achieves almost 100% accuracy even on lengthy text inputs. The model incorporates GLM-4 All Tools, an intelligent agent feature that autonomously understands and executes complex instructions and can interact with web browsers, code interpreters, and multimodal text-generation models.
Around GLM-4 we have also developed a series of models, forming a relatively complete full-stack large-model technology system that covers multimodality, code generation, search enhancement, and dialogue.
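As a rough illustration of how an "All Tools"-style agent can route a model's decision to an external tool such as a code interpreter or a web browser, here is a minimal dispatch-loop sketch. The tool names, the reply format, and the `agent_step` function are assumptions made for exposition only and are not Zhipu AI's actual interface.

```python
# Minimal sketch of an "All Tools"-style dispatch loop.
# Hypothetical: run_python, browse, and the reply format are illustrative only,
# not the GLM-4 All Tools API.
from typing import Callable, Dict


def run_python(code: str) -> str:
    # Stand-in for a sandboxed code interpreter.
    return f"[interpreter output for: {code}]"


def browse(url: str) -> str:
    # Stand-in for a web-browsing tool.
    return f"[page content of: {url}]"


TOOLS: Dict[str, Callable[[str], str]] = {"python": run_python, "browser": browse}


def agent_step(model_reply: dict) -> str:
    """Execute the tool the model chose and return the observation.

    Assumes the model emits something like {"tool": "python", "argument": "2 + 2"};
    if no tool is requested, the model's final answer is returned as-is.
    """
    tool = TOOLS.get(model_reply.get("tool", ""))
    if tool is None:
        return model_reply.get("answer", "")
    return tool(model_reply["argument"])


# Example: the model decides to call the code interpreter.
print(agent_step({"tool": "python", "argument": "2 + 2"}))
```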
The field of machine learning (ML) is experiencing a paradigm shift. While innovative algorithms and architectures once dominated the focus and will continue to evolve, the spotlight is now on data. Large models are becoming commonplace, and real-world effectiveness is paramount. This necessitates a data-centric approach encompassing the entire data lifecycle, from collection and cleansing to orchestration and supply, to satisfy the appetite of these huge and ever-growing models.
This panel discussion will delve into the industry challenges associated with data efficiency. We will explore:
* The rise of data-centricity: moving beyond algorithms to prioritize data quality, management, and utilization.
* Challenges of large models and real-world application: ensuring data is sufficient, representative, and addresses real-world complexities.
* Data lifecycle considerations: optimizing data collection, storage, transformation, and integration for robust AI systems.
By fostering open dialogue on these critical challenges and opportunities, this panel discussion aims to propel the field of data-centric AI towards a future of responsible, impactful, and collaborative innovation.
We discuss how years of research advances now power the private training of Gboard language models (LMs), from the proof-of-concept development of federated learning (FL) in 2017 to formal differential privacy (DP) guarantees in 2022. FL enables mobile phones to collaboratively learn a model while keeping all the training data on device, and DP provides a quantifiable measure of data anonymization. Formally, DP is often characterized by (ε, δ), with smaller values representing stronger guarantees. Machine learning (ML) models are considered to have reasonable DP guarantees for ε = 10 and strong DP guarantees for ε = 1 when δ is small. As of today, all next-word-prediction (NWP) neural network LMs in Gboard are trained with FL under formal DP guarantees, and all future launches of Gboard LMs trained on user data require DP. These 30+ on-device Gboard LMs have launched in 7+ languages and 15+ countries and satisfy (ε, δ)-DP guarantees with a small δ of 10⁻¹⁰ and ε between 0.994 and 13.69. To the best of our knowledge, this is the largest known deployment of user-level DP in production at Google or anywhere, and the first time a strong DP guarantee of ε < 1 has been announced for models trained directly on user data.
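To make the mechanics concrete, the sketch below shows one aggregation round in the spirit of differentially private federated averaging: each client's update is clipped to bound any single user's influence, and Gaussian noise calibrated to that bound is added to the aggregate. This is an illustrative sketch under assumed parameter names; it is not the production Gboard training pipeline, and the final (ε, δ) guarantee would come from a privacy accountant over many rounds.

```python
# Illustrative sketch of a differentially private federated-averaging round.
# Hypothetical code: function name, parameters, and noise calibration are
# assumptions for exposition, not Google's production implementation.
import numpy as np


def dp_fedavg_round(global_weights, client_updates, clip_norm=1.0,
                    noise_multiplier=1.0, rng=None):
    """Aggregate client updates with per-user clipping and Gaussian noise.

    Clipping bounds each user's contribution (the DP "sensitivity"); the noise
    multiplier, together with the number of rounds, determines the final
    (epsilon, delta) guarantee via a privacy accountant (not shown here).
    """
    rng = rng or np.random.default_rng(0)
    clipped = []
    for update in client_updates:
        norm = np.linalg.norm(update)
        scale = min(1.0, clip_norm / (norm + 1e-12))  # shrink overly large updates
        clipped.append(update * scale)
    mean_update = np.mean(clipped, axis=0)
    # Noise std is calibrated to the clipping norm and divided by the number
    # of participating clients because we noise the *average* update.
    std = noise_multiplier * clip_norm / len(client_updates)
    noise = rng.normal(0.0, std, size=np.shape(mean_update))
    return np.asarray(global_weights) + mean_update + noise


# Example: three simulated client updates for a 4-parameter model.
w = np.zeros(4)
updates = [np.array([0.2, -0.1, 0.05, 0.3]),
           np.array([1.5, 0.0, -0.4, 0.2]),   # large update gets clipped
           np.array([-0.1, 0.1, 0.1, -0.2])]
print(dp_fedavg_round(w, updates))
```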
Reliable evaluations are critical for improving language models, but they are difficult to achieve. Traditional automated benchmarks often fail to reflect real-world settings, and open-source evaluation sets are, empirically, overfit. Conducting evaluations in-house is burdensome and demands significant human effort from model builders.
To tackle these issues, Scale AI has created a set of evaluation prompt datasets in areas such as instruction following, coding, math, multilinguality, and safety. Summer Yue (Chief of Staff, AI; Director of Safety and Standards at Scale AI) will discuss these eval sets, as well as the launch of a new platform that lets researchers gain insights into their models' performance. She will also introduce a unique feature that warns developers of potential overfitting on these sets.
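The abstract does not describe how the overfitting warning works; as a purely hypothetical illustration of the idea, one simple signal is a large gap between a model's score on a public evaluation split and its score on a private, held-out split drawn from the same distribution. The function below is an assumption for exposition, not Scale AI's method.

```python
# Hypothetical illustration of an overfitting signal; not Scale AI's method.
def flags_overfitting(public_score: float, heldout_score: float,
                      tolerance: float = 0.05) -> bool:
    """Return True when the public-set score exceeds the held-out score by
    more than `tolerance`, suggesting the model may have overfit to (or been
    trained on) the public evaluation set."""
    return (public_score - heldout_score) > tolerance


# Example: 92% on the public split vs. 81% on a matched held-out split.
print(flags_overfitting(0.92, 0.81))  # True -> warn the developer
```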