@maven_hq: AI Evals: how to measure the performance of AI systems. Evals are increasingly becoming one of the most critical parts of agentic systems. #ai #evals #agenticai #maven

Maven
Region: US
Monday 04 May 2026 18:58:36 GMT
30904
1669
14
147


Comments

liquidlettuce
liquidlettuce :
But who evaluates the evaluator??
2026-05-05 02:49:37
4
the_ninety_and_nine
don :
Evals improve output but they are not deterministic. They’re very flappy.
2026-06-17 13:04:17
0
seattleaigal
seattleaigal :
PEAS framework🔥
2026-05-25 01:16:44
0
g33kg1rlg0nemixing
g33kg1rlg0nemixing :
next step...add a referee
2026-05-04 20:17:11
0
ih4tetikt0k1
ih4tetikt0k1 :
Naturally, you would want a specialized LLM to judge the output. But some of the good LLM judges are black box. Sometimes I wonder if they are just classifiers rather than LLMs
2026-05-10 16:08:09
0
xwhisprs2
thomas :
But how reliable is an eval agent?
2026-05-04 19:12:33
0
fullvaluedan
Full Value Dan :
what's the workflow for setting up the graphics?
2026-05-23 15:00:29
0
miketktok2
Mike :
The course is good but quite expensive
2026-05-05 16:11:00
0
mattludt2026
mattludt2026 :
No cats today
2026-05-05 03:30:14
0
ishtyaqahmed10
Ishtyaq Ahmed :
This is amazing!
2026-05-04 23:26:41
0
ahriman62
P-LU☕️🤌SA :
Is this like metadata?
2026-05-05 03:45:54
1
dapbooks21
dapbooks21 :
😁😁😁
2026-05-12 23:46:55
0