Affinity Posters
Blog Track Session 6
David Dobre · Leo Schwinn · Claire Vernade · Charlie Gauthier · Fabian Pedregosa · Gauthier Gidel
Halle B
Schedule
Thu 7:30 a.m. - 9:30 a.m.
The Hidden Convex Optimization Landscape of Two-Layer ReLU Networks (Poster #3)
Poster Location: Halle B #3
In this article, we delve into the research paper 'The Hidden Convex Optimization Landscape of Regularized Two-Layer ReLU Networks'. We focus on the significance of this study and evaluate its relevance in the current landscape of machine learning theory. The paper describes how solving a convex problem can directly yield the solution to the highly non-convex problem of optimizing a two-layer ReLU network. After building intuition for the proof through a few examples, we examine the limits of this model, as we may not yet be able to discard the non-convex problem entirely.
Victor Mercklé · Franck Iutzeler · Ievgen Redko
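As a toy illustration of the idea this abstract describes — once the ReLU activation patterns on the training data are fixed, the network output becomes linear in its parameters, so fitting them is a convex problem — here is a minimal numpy sketch. All names and data are our own, and for brevity it drops the cone constraints and group-sparsity regularization that the full convex reformulation in the paper requires:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 2
X = rng.normal(size=(n, d))
y = rng.normal(size=n)

# Sample candidate hyperplanes; each induces an activation pattern
# (a 0/1 mask over the n data points) for a hidden ReLU unit.
m = 50
G = rng.normal(size=(d, m))
patterns = np.unique((X @ G >= 0).astype(float), axis=1)  # n x p distinct masks

# With the masks fixed, the output sum_i D_i X u_i is linear in the u_i,
# so the fit reduces to least squares -- a convex problem.
features = np.concatenate(
    [patterns[:, [i]] * X for i in range(patterns.shape[1])], axis=1
)
u, *_ = np.linalg.lstsq(features, y, rcond=None)
pred = features @ u
print("residual:", np.linalg.norm(pred - y))
```

This only shows why fixing activation patterns removes the non-convexity; the paper's contribution is that a single convex program over all (suitably constrained) patterns recovers the global optimum of the regularized non-convex objective.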
Thu 7:30 a.m. - 9:30 a.m.
RLHF without RL (Poster #2)
Poster Location: Halle B #2
Reinforcement learning from human feedback (RLHF) plays an important role in aligning language models with human preferences. However, there has been some debate about whether RLHF is actually reinforcement learning at all: the environment consists of the model itself, and no new data is acquired during training. The only way additional data enters the training is through the supervised fitting of the reward function. Recently, this debate has intensified with the publication of the Direct Preference Optimization (DPO) algorithm, which bypasses reinforcement learning entirely. In this blog post, we discuss related work, highlight the information flow of RLHF, and analyze to what extent alignment requires RL for modern applications of LLMs.
Mischa Panchenko
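The Direct Preference Optimization loss mentioned in the abstract can be sketched in a few lines. This is a hedged illustration with made-up log-probabilities, not the post's or the DPO authors' code; the log-probs would come from the policy and a frozen reference model, summed over the tokens of each completion:

```python
import numpy as np

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one preference pair: -log sigmoid of the scaled
    difference in implicit rewards. No reward model or RL rollout needed."""
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    return -np.log(1.0 / (1.0 + np.exp(-margin)))  # -log sigmoid(margin)

# If the policy favors the chosen completion more than the reference does,
# the margin is positive and the loss drops below log 2 (its value at zero).
print(dpo_loss(-10.0, -12.0, -11.0, -11.5, beta=0.1))
```

The supervised, offline character of this objective — a classification-style loss on fixed preference pairs — is exactly what fuels the "is RLHF really RL?" discussion the post takes up.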
Thu 7:30 a.m. - 9:30 a.m.
Unraveling The Impact of Training Samples (Poster #1)
Poster Location: Halle B #1
How do we quantify the influence of datasets? Recent work on data attribution methods sheds light on this problem. In this blog post, we introduce data attribution methods that leverage robust statistics and surrogate functions, and present their applications, such as distinguishing differences in feature selection across learning algorithms, detecting data leakage, and assessing model robustness.
Daiwei Chen · Jane Zhang · Ramya Vinayak
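To make the attribution question concrete, here is a minimal leave-one-out sketch on ridge regression: score each training sample by how much deleting it shifts the prediction on a held-out point. This toy example and all names in it are ours; the methods surveyed in the post replace this brute-force retraining with robust statistics and surrogate functions to scale to real models:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 30, 3
X = rng.normal(size=(n, d))
w_true = rng.normal(size=d)
y = X @ w_true + 0.1 * rng.normal(size=n)
x_test = rng.normal(size=d)

def ridge(X, y, lam=1e-2):
    """Closed-form ridge regression weights."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Leave-one-out attribution: change in the test prediction when
# sample i is removed and the model is refit.
w_full = ridge(X, y)
scores = np.array([
    x_test @ ridge(np.delete(X, i, axis=0), np.delete(y, i)) - x_test @ w_full
    for i in range(n)
])
print("most influential sample:", int(np.argmax(np.abs(scores))))
```

Samples with large |score| are the ones the test prediction depends on most — the same quantity that influence-function and surrogate-based attribution methods estimate without retraining.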