Alibaba staff offers glimpse into life of building LLM in China

Admin, February 27, 2024

Chinese tech companies are gathering all sorts of resources and talent to narrow their gap with OpenAI, and experiences for researchers on both sides of the Pacific Ocean can be surprisingly similar. A recent X post from an Alibaba researcher offers a rare glimpse into the life of developing large language models at the ecommerce firm, which is amongst a raft of Chinese internet giants striving to match the capabilities of ChatGPT.

Binyuan Hui, a natural language processing researcher at Alibaba’s large language model team Qwen, shared his daily schedule on X, mirroring a post by OpenAI researcher Jason Wei that went viral recently.

The parallel glimpses into their typical days reveal striking similarities, with wake-up times at 9 a.m. and bedtimes around 1 a.m. Both start the day with meetings, followed by a period of coding, model training and brainstorming with colleagues. Even after getting home, they continue to run experiments at night and ponder ways to enhance their models right up until bedtime.

The notable difference is that Hui, the Alibaba employee, mentioned reading research papers and browsing X to catch up on “what is happening in the world.” And as a commentator pointed out, Hui doesn’t have a glass of wine after he arrives home like Wei does.

This intense work regime is not unusual in China’s current LLM space, where graduates of top universities are joining tech companies in droves to build competitive AI models. To a certain extent, Hui’s demanding schedule reflects a personal drive to match, if not outpace, Silicon Valley companies in the AI space. It seems different from the involuntary “996” work hours associated with more “traditional” types of Chinese internet businesses that involve heavy operations, such as video games and ecommerce.

My typical day as a Member of Technical Staff at Qwen (Just for myself):
[9:00am] Wake up, might stay in bed for an extra 15 mins.
[9:30am] Taking a cab to work, browsing X to catch up on what’s happening in the world, checking out @_jasonwei’s latest post.
[10:00am] Work… https://t.co/7o47EQrWcW

— Binyuan Hui (@huybery) February 21, 2024

Indeed, even renowned AI investor and computer scientist Kai-Fu Lee puts in an incredible amount of effort. When I interviewed Lee about his newly minted LLM unicorn 01.AI in November, he admitted that late hours were the norm, but employees were willingly working hard. That day, one of his staff messaged him at 2:15 a.m. to express his excitement about being part of 01.AI’s mission.

Such a work ethic partly explains the speed at which China’s tech firms are able to introduce LLMs. Qwen, for example, has open sourced a series of foundation models trained with both English and Chinese data. The largest has 72 billion parameters, the numerical values a model learns from its training data, which shape its ability to generate contextually relevant responses. The team was also quick to introduce commercial applications. Last April, Alibaba began integrating Qwen into its enterprise communication platform DingTalk and online retailer Tmall.

No definite leader has emerged in China’s LLM space so far, and venture capital firms and corporate investors are spreading their bets across multiple contenders. Besides building its own LLM in-house, Alibaba has been aggressively investing in startups such as Moonshot AI, Zhipu AI, Baichuan and 01.AI.

Facing competition, Alibaba has been trying to carve out a niche, and its multilingual move could become a selling point. In December, the company released an LLM for several Southeast Asian languages. Called SeaLLM, the model is capable of processing information in Vietnamese, Indonesian, Thai, Malay, Khmer, Lao, Tagalog and Burmese. Through its cloud computing business and acquisition of ecommerce platform Lazada, Alibaba has established a sizable footprint in the region and can potentially introduce SeaLLM to these services down the road.

My typical day as a Member of Technical Staff at OpenAI:
[9:00am] Wake up
[9:30am] Commute to Mission SF via Waymo. Grab avocado toast from Tartine
[9:45 am] Recite OpenAI charter. Pray to optimization Gods. Learn the Bitter Lesson
[10:00am] Meetings (Google Meet). Discuss how to…

— Jason Wei (@_jasonwei) February 20, 2024


