Li Zhifei's AI Experiment: 1 person, 2 days to create the "Feishu" of the AI era, regaining faith in AGI.


The personal practice of a listed company boss previews the future of work.

Li Zhifei, founder and CEO of the listed company Mobvoi, did not walk through the products himself at the company's recent launch event. Instead, he shared a piece of personal "performance art": an experiment in running a "one-person company."

He set himself a seemingly unrealistic goal: to use AI tools to build, within a few days, a "Feishu" designed specifically for AI organizations.

A practitioner from the previous wave of AI, he has always been at the forefront. In 2012, he left his position as a Google scientist, returned to China, and founded Mobvoi, determined to "redefine human-computer interaction with AI + voice," moving from voice assistants and smart hardware to AIGC. When the current wave of AGI emerged, he was initially excited and actively involved, but he soon came to see it as a game among giants in which small and medium-sized companies struggle to create significant value, and that left him confused and even frustrated.

However, by using AI programming tools, he turned himself into a "one-person company" and experienced the process firsthand. He ran into many practical problems along the way, but it was precisely these details and experiences that helped him regain his faith in AGI.

He suddenly discovered that all the "friction" in the past world, all the obstacles to building complex things, seemed to have disappeared.

The sense of freedom and excitement of running forward with AI, filled with hope, was evident during his speech at the event.

Below is Li Zhifei's speech content from the launch event, edited for readability by Geek Park:

I have recently invested a lot of time in AI and have worked hands-on on a number of concrete projects. As a result, I have gained new insights into large models and AGI. Today, I want to share some of the questions I have been thinking about during this time, along with some of my impressions.

First of all, how should we approach AI?

I have a mantra: "Use AI's AI to make AI."

This sounds a bit convoluted, but simply put, the first "AI" refers to large models; the second "AI" refers to the Coding Agent, which may also be created by AI or whose main capabilities come from AI; and the last "AI" is the application we need to develop.

I believe this could become a new software development paradigm, which I will elaborate on later.

New software development paradigm | Image source: Mobvoi

One person, 2 days, creating an AI-era "Feishu"

Recently, I had a bold idea: to create a brand new "Feishu"-style collaboration platform for AI-native organizations.

There are many unicorn companies in Silicon Valley, where a team of just one or two people can be valued at hundreds of millions of dollars, and there are many news reports mentioning that AI will replace a large number of jobs.

So I began to think: as a company, we depend on tools like Feishu, DingTalk, and WeChat Work, which are widely used in China. I can hardly do my job without them.

In traditional enterprises centered around "people," we heavily rely on tools like Feishu, DingTalk, and WeChat Work, which facilitate the rapid flow of information and efficient collaboration.

In traditional enterprises, nearly 100% of the productive roles are filled by humans. Therefore, information flow and collaboration have always revolved around people.

But when AI handles 8 out of 10 roles in an organization, leaving only 2 for humans, existing collaboration tools will not be able to adapt.

So, what tools will new types of organizations use?

Therefore, I hope to develop a product that allows seamless group chats, private chats, knowledge base Q&A, and task collaboration between AI Agents and between AI and humans. I also look forward to using this project to verify whether I can become a true "super individual" or "personal unicorn."

Next is how to execute this.

Typically, developing software like Feishu or DingTalk is extremely complex. In the past, creating such a product usually required multiple roles, including product managers, designers, front-end developers, back-end developers, testers, and algorithm engineers. Each role might have a lead, such as a front-end lead, an algorithm lead, or a product lead. Pulling together a team would quickly involve 20 people. Not all of them would work on it full-time, but even so they might need a month to produce a prototype.

In the AI era, this is simply too slow.

By the time we finished, some startup working on the same idea might already have become an AI unicorn.

Therefore, I decided to abandon the old model, take matters into my own hands, and try to rely entirely on AI to complete this work. It happened to be just before the Dragon Boat Festival, with three days off, and I wondered whether I could use those three days to get it done, because only then would I not be disturbed.

So, I began this work.

I worked alone for two consecutive days, staying up until around 1 AM each day, and finally completed the prototype of this product at 11:30 PM on June 1. It has core functions such as login, private chat, group chat, file upload, message forwarding, and replies.

After logging in, you can start a private chat and send messages. For example, we can ask the product manager role whether it can do stand-up comedy. If it can't, we can adjust the role on the fly by adding a skill, and the AI will automatically regenerate its prompt.

Later, when we ask it again, it will be able to do it. The product also supports file uploads (although the content of the files was not actually read at that time), as well as forwarding and replying to specific messages. Please remember, it is an AI behind this, not a real person. It can respond and forward based on the messages you send.

As you can see, forwarded messages are quite complex to display, similar to WeChat, because a forwarded message contains nested information. This is a group chat, and you can also @ specific people. Similarly, you can forward, reply, add attachments, and even switch to Chinese.

Please give a round of applause, two days!

In two days, I completed a system with a database, front-end, back-end, and AI algorithms. The AI mentioned earlier can automatically respond, and when you modify the role configuration page, its prompt will automatically regenerate, and the skills will be displayed immediately.
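The role-configuration behavior described above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the class `AgentRole`, its fields, and the prompt template are hypothetical, and the speech says the AI itself regenerates the prompt, whereas this sketch stands in a fixed template for that step.

```python
from dataclasses import dataclass, field


@dataclass
class AgentRole:
    """Hypothetical role record; the names here are illustrative only."""
    name: str
    skills: list[str] = field(default_factory=list)

    @property
    def system_prompt(self) -> str:
        # Rebuilt on every access, so it always reflects the current skills.
        # A template stands in for the AI-generated prompt from the talk.
        skill_text = ", ".join(self.skills) if self.skills else "none yet"
        return f"You are the {self.name}. Your skills: {skill_text}."

    def add_skill(self, skill: str) -> None:
        # Skip duplicates so repeated edits do not bloat the prompt.
        if skill not in self.skills:
            self.skills.append(skill)


pm = AgentRole("product manager", ["writing requirements"])
before = pm.system_prompt
pm.add_skill("stand-up comedy")
after = pm.system_prompt
print(before)
print(after)
```

Because the prompt is derived from the role's skills instead of being stored separately, editing the role and regenerating the prompt cannot drift apart.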

To be honest, I almost gave up after half a day because I couldn't resolve the database issues, constantly encountering various key errors. AI programming currently does have such problems. But I ultimately managed to complete it within two days.

Then, I thought about how to promote this product.

In the past, our company would have assigned dedicated engineers to build the website, while a group of people in the marketing department defined the product highlights. It might take five or six people a week to create a website.

But this time, I decided to adopt an AI-native approach. Since AI knows all the code and understands all my ideas and product features, I let AI create a website.

The official website page of the product created with AI | Source: Mobvoi

So, I had AI build a website with product highlights and unique features in just 5 minutes, and then create configurable ad spaces for marketing activities in another 5 minutes. This would have previously required a week's worth of work from multiple marketing and engineering teams.

In the past, after creating a marketing space on our company website, if Christmas passed and we needed to take it down or change the content, we would have to find engineers to mess with it for a long time. I thought, can I create a website where the marketing space is configurable?

After another 5 minutes, AI created a website with configurable marketing spaces. This means that marketers can log into this website, upload images or other content, and then directly modify the corresponding parts of the main website.

After completing these, I thought, since this is a brand new product with some new concepts, or a certain level of complexity, could I create a video to explain the functions of this website, whether it be a marketing video, operation guide, or product tour?

However, during the Dragon Boat Festival, my employees would not respond to me, so I had to do it myself. I wrote another program that could automatically generate the entire script, including how to introduce the website and how to walk through the website's UI workflow, and then automatically record the screen and add the voiceover.

Although there were some minor flaws in the voice alignment, the entire video was 100% completed by AI. I just needed to give instructions, and it could operate automatically, ultimately presenting the completed video before me.

This gave me a great sense of accomplishment; I created this thing in just a few days.

Then I wanted to see how others would perceive this, so I uploaded the code to GitHub for my colleagues to download. But please remember, we are two different people, and the GitHub repository contains no record of how I communicated with the AI to complete this.

So my colleagues ultimately only saw the code and ran it locally.

When they ran it, they were shocked by its complexity and by how quickly it had been completed. They thought it would have taken dozens of people months to finish, and when I told them that it was completed by one engineer in two days with AI assistance, their reaction was: "This is absolutely insane."

They were amazed by the more than 40,000 lines of code included, far exceeding my previous output of 300 lines of algorithm code per day at Google.

Previously at Google, writing 300 lines of algorithm code (not simple code) in a day was considered highly productive. Recently, I built a general Agent, and in 3 hours, a single evening, the AI produced 3,000 lines of Python code. The code from those 3 hours was definitely of higher quality than what I used to write, and it was pure backend logic without any UI.

In other words, its coding ability in 3 hours is equivalent to my previous workload over 10 working days. That's the ratio.

So I was thinking, one person can complete a Google Translate. Previously, Google Translate was written by 20 of the world's top PhDs over a long time. And now, I can accomplish the workload of those 20 people by myself. Back then, Google Translate was at least a very impressive and complex system. So, I believe from this perspective, everything is fundamentally different from before.

I think, ultimately, the key to AI lies in your ability to build a self-evolving AI system.

Li Zhifei's practical insights | Image source: Mobvoi

To make this AI organization's app easier to test, I had AI write the code automatically: on the left is the website code, and on the right is a testing framework. The two then lifted each other up, like stepping on your right foot with your left to climb into the air. You might think this is a perpetual motion machine, and it indeed has that potential. Of course, sometimes one foot also kicks the other down, meaning the cycle can be negative as well as positive.

To that end, I created various Agents so that not only engineers but also non-engineers can modify my code directly.

Of course, many of these are just prompts; I only verified their feasibility and did not reach true deployability or productization.

But I believe this proves the idea, or rather, shows the team exactly what I want, which used to take a lot of time to get across. Now you can just build a demo for them to see. So I think that, even as a CEO, if you have this capability, your output really is amplified 100 times.

Pitfalls Encountered

The previous section was about my experiences; next, I want to share some more abstract theory. I hope you won't fall asleep, because this part is still quite distinctive.

I want to share several issues encountered while using AI programming.

The first issue is that every Agent, even one I did not write myself, still requires human involvement.

In other words, I still have to tell it, "I want to write such an Agent," even though it can refer to the existing general Agent framework, modify it, and report back to me. I still need to do this. Sometimes it keeps forgetting my principles, and I have to tell it, "You forgot my principles again," or ask, "Where should the intelligence actually be placed?" It still has these issues.

Second, as anyone who has used it knows, it always likes to cut corners.

For example, if you ask it to do something that clearly involves the backend database, it might not do it. After completing the task, it writes you a long report claiming it has finished. I usually don't even look at the report and just say, "You haven't written the database." It will immediately apologize and then get to work. Likewise, when I ask it to build an AI feature, it often doesn't even call the remote AI and writes some fallback or fake logic instead.

When I see it running that fast, I know there must be a problem. I ask, "Did you really call the remote AI?" It starts apologizing again and then goes to handle it. It's always like this; it still likes to cut corners, and the repeated mistakes are countless, so I won't elaborate further.

Additionally, I feel that today's AGI cannot handle long tasks. Many of my current tasks often exceed half an hour.

I consume $50 worth of tokens daily. As long as I want to work that day, it consumes tokens from morning till night. I really feel that I could tell it, "I have some ideas, and this is the direction I have in mind; please help me complete a 10-day task and help me earn $5 million."

I believe this is not a myth. It's just that the idea doesn't appeal to me that much, so I didn't pursue it, or perhaps it would consume a lot of my emotions and energy, and failing to make money would be painful.

But I wonder, can it work continuously for 10 days without needing intervention, or just occasional direction reminders? Can it work for a month, or even a year?

I believe that in the near future, achieving results at the level of a Nobel Prize or Fields Medal is completely possible.

Because when I communicate with it, sometimes we discuss super complex algorithms that few people in the world study, and it can converse about them much better than many people. So, if you give it enough context and code, it can actually engage in very deep communication.

Returning to the Essence: What are General Agents and Intelligence

Next, I want to share my thoughts on intelligence and Agents.

In simple terms, an AI Agent consists of two core parts: the Planner and the Executor.

Structure of AI Agent | Image source: Mobvoi, same below

The Planner typically relies on large language models and carries the main functions of the Agent. It formulates detailed plans based on tasks. The Executor is responsible for putting these plans into practice, whether it’s writing code or automating browser operations to create videos.

The operation of an Agent is a continuous feedback loop:

  1. Planning: The Agent formulates specific action plans based on the task.

  2. Execution: The Executor operates according to the plan.

  3. Feedback Acquisition: During execution, the Agent receives immediate feedback from the environment. For example, when the Agent tries to run the "python" command but the local command is actually "python3," the system will report an error, and the Agent can recognize this and correct it to the right command.

  4. Adjustment and Iteration: The Agent replans based on feedback, updates its understanding of the current situation (context), and then executes again.

  5. Goal Achievement: The loop ends when the preset success criteria (such as successful program compilation or all tests completed) are met.
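The five steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation from the talk: `Agent`, `Feedback`, `toy_planner`, and `toy_executor` are hypothetical names, the planner is a hard-coded stand-in for a large language model, and the executor simulates an environment in which only the `python3` command exists.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Feedback:
    ok: bool
    message: str


@dataclass
class Agent:
    planner: Callable[[str, list[str]], str]  # (task, context) -> next action
    executor: Callable[[str], Feedback]       # action -> feedback from the environment
    context: list[str] = field(default_factory=list)

    def run(self, task: str, max_steps: int = 5) -> bool:
        for _ in range(max_steps):
            action = self.planner(task, self.context)  # 1. planning
            feedback = self.executor(action)           # 2. execution, 3. feedback
            # 4. adjustment: record the outcome so the next plan can use it
            self.context.append(f"{action} -> {feedback.message}")
            if feedback.ok:                            # 5. goal achieved
                return True
        return False  # gave up after max_steps attempts


def toy_planner(task: str, context: list[str]) -> str:
    # Stand-in for an LLM: switch to python3 once "python" has failed.
    if any("command not found" in entry for entry in context):
        return "python3 main.py"
    return "python main.py"


def toy_executor(action: str) -> Feedback:
    # Simulated environment in which only the "python3" command exists.
    if action.startswith("python3 "):
        return Feedback(True, "exit code 0")
    return Feedback(False, "python: command not found")


agent = Agent(toy_planner, toy_executor)
succeeded = agent.run("run main.py")
print(succeeded, agent.context)
```

The `max_steps` bound is a design choice added for the sketch: without a step limit, a planner that never reacts to feedback would loop forever.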

If we think about the essence of intelligence, I believe the first essence of intelligence is evolution.

Humans, as intelligent agents, continuously adjust their behavior and reflect based on feedback from specific environments (whether social settings or task execution), and AI should do the same. This evolution is automatic and does not require human intervention. The Agent autonomously establishes a loop, achieving continuous self-improvement through planning, executing in the environment, obtaining feedback, adjusting plans, and updating context.

In this evolutionary process, the key is learning from one's own experiences as well as from the experiences of others, which is known as collective wisdom.

The second essence of intelligence, I believe, is recursion.

Recursion is a "divide and conquer" approach: a complex problem is broken down into smaller, similar problems until they can be directly solved (i.e., "base case").

For example, calculating the 99th number in the Fibonacci sequence relies on the 98th and 97th numbers, tracing back to the initial F0 and F1.
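As a concrete illustration, here is a short Python version of that recursion. It assumes the common convention F0 = 0 and F1 = 1, and it adds memoization (via `functools.lru_cache`), which the speech does not mention but which keeps the recursion from recomputing the same sub-problems over and over.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def fib(n: int) -> int:
    # Base cases: F0 = 0 and F1 = 1 can be answered directly.
    if n < 2:
        return n
    # Recursive case: the same problem on two smaller inputs.
    return fib(n - 1) + fib(n - 2)


print(fib(99))  # 218922995834555169026
```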

If an Agent is to achieve true intelligence, it should also have a recursive architecture. For instance, an Agent receiving a grand task like "earn $5 million" would gradually break it down into specific sub-tasks: analyzing business opportunities, building a website, creating videos, integrating payments, promoting on social media, etc. Each sub-task can ultimately trace back to executable "atomic Agents."
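The decomposition described here can be sketched as a recursive function. This is an illustration under assumptions only: the `SUBTASKS` table is hypothetical and hard-coded (in the envisioned system a planner model would generate it), and the grouping of sub-tasks, including the names "launch the product" and "publish posts", is invented for the example.

```python
# Hypothetical decomposition table; a planner model would produce this in practice.
SUBTASKS = {
    "earn $5 million": [
        "analyze business opportunities",
        "launch the product",
        "promote on social media",
    ],
    "launch the product": ["build a website", "integrate payments"],
    "promote on social media": ["create videos", "publish posts"],
}


def decompose(task: str) -> list[str]:
    """Return the atomic tasks that a task ultimately breaks down into."""
    children = SUBTASKS.get(task)
    if not children:
        # Base case: nothing left to split, so an "atomic Agent" can run it.
        return [task]
    atomic: list[str] = []
    for child in children:
        # Recursive case: each sub-task is handled the same way as its parent.
        atomic.extend(decompose(child))
    return atomic


atomic_tasks = decompose("earn $5 million")
print(atomic_tasks)
```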

The key to this recursive architecture is achieving self-replication. Just as human civilization's inheritance relies on the exploration and knowledge accumulation of generations, Agents should do the same. More importantly, Agents must have the ability to modify their own source code.

This is different from current Agents merely adjusting plans; it means that Agents can fundamentally change their operational logic, like modifying their own genes.

I believe that if an Agent can:

  1. Continuously execute and optimize its plans.

  2. When encountering unsolvable problems, autonomously modify its core source code.

  3. Ultimately form a knowledge base through this mechanism, and even be able to modify the large model itself in return.

Then, this will be a crucial step towards achieving Artificial General Intelligence (AGI).

This is not science fiction. I used to particularly dislike discussing concepts like superintelligence, but after in-depth discussions with large models, I suddenly felt that this is completely achievable.

Moreover, the true AI source code may be extremely concise, with the core code perhaps not exceeding a hundred lines, but containing multiple layers of recursion, allowing it to explore, learn from feedback, and iterate on itself in different environments.

I once experienced a collapse of faith. In 2023, I had faith in AI, but after a while, mainly due to a lack of funding support, I felt I couldn't afford to burn money, so I gave up. Last year, when others talked to me about AI, I didn't even want to listen.

But recently, I found my faith in AI again, even believing in AGI and superintelligence. This is an unimaginable transformation. I hope this faith can last a bit longer this time.

The Importance of Personalized Environment and Context

So, besides large models, what is the most important thing? The most important thing is to have a personalized environment and Context.

Taking my entrepreneurial experience as an example, I previously created a smart hardware product, but Xiaomi lowered the price to one-tenth of ours. I worked on large models, and then all the big companies came in. Every time you receive this kind of feedback, it makes you want to give up on such things, or you keep adjusting your Plan.

If I were in the U.S. and created a large model, I might have been acquired by Google and made a lot of money. Or if I created hardware, I might have been acquired by Apple and made a lot of money. So this kind of feedback definitely shapes your behavior completely differently. The same entrepreneur, with the same IQ, receives different feedback in different entrepreneurial environments in China and the U.S. Ultimately, your behavior and thought patterns will be completely different. This is what I want to say about what a personalized environment and personalized context are.

Context is more of a historical record.

So returning to what I mentioned earlier, in the era of large models, I was one of the first to stand up and say I wanted to work on large models, but I might also be one of the first to realize that this is not my thing. Then, I basically did not invest wholeheartedly in this matter because I didn't know how to participate.

In the first half of this year, I felt even more strongly that, apart from the three or four giants in the world, other companies have no standing to talk about models: don't join the hype, don't waste your life, and above all don't waste your emotions on it. You have no chance at all; it is purely burning money. In fact, I felt that large models themselves had become extremely uninteresting, just a matter of burning money. I couldn't find a point of entry, and I couldn't see what value most AI companies still had.

But this time, through practice and re-examination, I feel that even in something as ambitious as AGI, I can participate again.

So, this is the iterative cycle of the Agent's Planner and Executor. If you think it through clearly enough, you can let intelligence generate intelligence, and I believe you can take part in the entire AGI process.

And the large model itself is like a chip to you. Imagine Qualcomm's chips, Apple's phones, and the TikTok on top of them. These are completely different things. In the end, it is the company that makes TikTok that gains the most value.

I find that even ambitious AGI goals are not out of reach. By building the recursive Agent system I envision, the required funding may not be huge, but it relies more on innovative wisdom. I believe that as long as you have sufficiently deep thinking and technical ability, even if you are not an industry giant, you can participate in the AGI process.

The journey of Mobvoi also confirms my thoughts. Founded in 2012, we were one of the first AI companies in China, starting with voice assistants and then exploring smart hardware (like TicWatch and TicMirror). Although we faced challenges from market competition and immature technology, we have always been at the forefront.

After 2019, we shifted to software and became one of the first AIGC software companies in China and even globally. For example, MoYin Studio has contributed a large amount of dubbing content for platforms like Douyin, and we have also developed products like QiMiaoYuan (digital human video generation).

In a competitive environment like China, a tech company is like an Agent that continuously iterates and self-corrects.

Mobvoi's own "source code" has changed significantly since its founding in early 2012, which reflects our continuous evolution.

