DingTalk Integrates Third-Party AI Models: A New Chapter for the Open Ecosystem

Published: 2024-06-27 15:42

TMTPOST--"Without relying on complex scenarios and data, purely knowledge-based Q&A models have no application prospects in the industry," said Ye Jun, President of DingTalk, an intelligent work platform created by Chinese e-commerce giant Alibaba Group.

Software as a service (SaaS) companies have faced unprecedented challenges in the past years, such as high costs and customization difficulties. "To meet market demands, many SaaS software solutions have been adapted to reduce costs, and AI large models have made interactive interfaces more user-friendly," Ye noted on Wednesday in a dialogue with Liu Xiangming, the co-founder and co-CEO of TMTPost Group.

Artificial intelligence (AI) large models, despite being relatively new to the public, are rapidly transforming various industries. However, transitioning from technology to practical application presents several challenges that require collaboration across the industry.

DingTalk has increased commission rebates this year to support ecosystem partners, addressing the challenges of SaaS transformation and cost efficiency in the AI era in China.

The development of a robust ecosystem is crucial for the "rebirth" of SaaS vendors. AI large models currently face high usage costs and low customization, along with commercialization challenges. Domestic large model vendors are engaged in a price war, and product homogeneity is stifling innovation. Addressing these issues and making AI large models affordable and user-friendly is now a priority.

DingTalk's ecosystem advantage lies in its open platform, which allows various large models to access customer scenarios and data in a safe manner. Users can seamlessly and securely utilize multiple models for highly customized development. This collaboration is essential as AI large models transition from technical competition to practical applications.

DingTalk held its third ecosystem conference in Beijing on Wednesday. The company announced it would open its platform to all large model vendors, aiming to build the most open AI ecosystem in China. The first batch of six large model vendors—MiniMax, Moonshot AI, Orion Star, Baichuan AI, Zhipu AI, and Ling Yi Wan Wu—have already integrated with the DingTalk platform.

Ye emphasized that many general-purpose large models lack good application scenarios. "Scenarios drive innovation, but large model vendors often lack both scenarios and customers. DingTalk can help these vendors find both, reducing costs for enterprise users."

MiniMax founder Yan Junjie highlighted the value of cooperating with DingTalk, citing its vast market and presence across industries. In April, DingTalk launched the AI Agent Store, initially offering over 200 AI assistants across various categories. By the end of May, the total number of AI assistants created had reached 500,000, though just over 700 were publicly listed.

Zhipu AI COO Zhang Fan said that applying large models involves significant opportunity costs. The cooperation with DingTalk covers model capability integration, toolchain construction, and transforming an agent platform into real applications across various enterprise scenarios.

As enterprises seek to profit from AI, making it a high-frequency, repeatedly used function is essential. Building an AI ecosystem is an essential step in the development of the large model industry. Ye aims to increase AI usage, with a goal of reaching 50 million calls per day, noting that the current number of AI users in China is below 10 million.

The following is a transcript of the conversation between Ye and Liu, edited by TMTPost for clarity and brevity.

TMTPost: What do you think about the development of Agent Store now?

Ye: I think creating a general-purpose Agent Store is meaningless. Without the support of scenarios and data, and without complex processes, creating an agent that amounts to pure knowledge Q&A is pointless. On the very day the GPT store was launched, there were three million bots. I looked at it and said, "Don't bother, this is meaningless." So DingTalk is very restrained. If we are going to do something, we will do something valuable. Currently, we have only just over 700 agents listed.

TMTPost: What do you think is a meaningful agent?

Ye: There are many. One category is for creation. Most of the time, people in companies are engaged in creation, whether it's employees, design departments, or marketing departments. They create copywriting, planning, and marketing posters. This is a new mode of creation in the AI era. In the past, creation required switching between various applications, and collaboration between systems was quite troublesome. When using multiple independent AI platforms, coordinating creative products was very annoying. On DingTalk, you can collaborate, not only with yourself but also with others.

Another category is RPA (robotic process automation). RPA had already shown some promise even without AI. Traditional RPA involved manual clicking, turning batch processing into automated processes. With the addition of AI, RPA becomes more intelligent. For example, from the HR system to the order system to the financial system, the status changes related to orders, changes in amounts, and the subsequent handling by financial personnel, as well as the flow of funds between finance and the bank, were previously disconnected. RPA can achieve data status updates from sales personnel to financial personnel.

However, traditional RPA has a problem. If the system changes and the buttons change slightly, RPA cannot understand and execute. But with intelligence, AI can recognize the next intention, making the process smarter. Whether to perform a click action or another action, RPA gains the ability to make judgments. Intelligence makes every step of digitization smarter.

This is the collaboration between multiple systems. The biggest difference between our AI and other vendors is that we have data and systems. Such complex systems bring scenarios, becoming the best stage for AI to play its role.
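The intent-judgment step Ye describes can be roughed out in code. This is a hypothetical sketch, not a DingTalk API: `classify_intent` is a keyword-matching stand-in for a real large-model call, and all function names and screen labels are illustrative.

```python
# Hypothetical sketch of AI-assisted RPA: rather than clicking a
# hard-coded button, the bot asks a classifier which action the
# current screen calls for. classify_intent here matches keywords;
# a real large-model call would also handle paraphrased or
# slightly changed labels, which is where rigid RPA breaks.

def classify_intent(screen_text: str) -> str:
    """Return the next intended action for the given screen text."""
    if "Confirm Order" in screen_text:
        return "click_confirm"
    if "Amount Changed" in screen_text:
        return "notify_finance"
    return "wait"

def run_step(screen_text: str) -> str:
    """Execute one RPA step based on the classified intent."""
    action = classify_intent(screen_text)
    if action == "click_confirm":
        return "clicked confirm button"
    if action == "notify_finance":
        return "sent update to finance system"
    return "no action taken"
```

Swapping the keyword stub for a model call is what turns button-replay RPA into the judgment-capable automation described here.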

TMTPost: What scenarios are RPA currently used in most effectively?

Ye: The top-ranked agents are still decent, but actually most AI assistants haven't been released publicly because they are used internally by companies. Since their AI assistants are tied to their own data, it's impossible for them to let others use them. As large models mature, there will be more opportunities in privatized scenarios.

TMTPost: What is the process for companies developing RPA now, and what role does DingTalk play in it?

Ye: In summary, it involves memory, a chain of thought, integration with large models, and integration of local data.

We have set up the framework and done several parts of the work. First, we connect with different models. We have integrated with Orion Star and MiniMax, allowing companies, developers, and individual creators to develop AI assistants. Users can choose the large model they think is most suitable, or a model already deployed internally in the company.

Second, the AI assistant comes preset with basic chain-of-thought and memory capabilities. The memory capability of AI is more intelligent than the so-called personalized settings of earlier search engines. The chain of thought can break down tasks: a language model first understands the task and then splits it into several steps. Although it's not that advanced yet, I think it's already huge progress.

Third, through the DingTalk platform, previously discrete data such as calendars, schedules, documents, and knowledge bases can be integrated through the AI assistant, without the need for API calls.
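The three parts Ye lists (pluggable models, preset memory and chain of thought, integrated local data) can be sketched as a minimal pipeline. Everything below is an illustrative stand-in, not the DingTalk API; the fake model exists only to show how the pieces connect.

```python
# Minimal sketch of the assistant pipeline described above:
# memory -> chain of thought (task decomposition) -> model call ->
# local data lookup. All names are hypothetical stand-ins.

class Assistant:
    def __init__(self, model, local_data):
        self.model = model            # pluggable large model (stand-in)
        self.local_data = local_data  # calendar, docs, knowledge base...
        self.memory = []              # remembered context across turns

    def decompose(self, task):
        # "Chain of thought": break the task into steps before acting.
        return [s.strip() for s in self.model(f"decompose: {task}").split(";")]

    def run(self, task):
        self.memory.append(task)
        results = []
        for step in self.decompose(task):
            # Pull in discrete local data without a separate API call.
            context = self.local_data.get(step, "")
            results.append(self.model(f"{step} | context: {context}"))
        return results

# A trivial fake "model" so the sketch runs end to end.
def fake_model(prompt):
    if prompt.startswith("decompose:"):
        return "check calendar; draft reply"
    return f"done({prompt.split(' | ')[0]})"

bot = Assistant(fake_model, {"check calendar": "meeting at 3pm"})
print(bot.run("schedule a follow-up"))
```

Choosing a different `model` callable is the "users can choose the large model" step; the rest of the pipeline stays unchanged.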

TMTPost: How do you evaluate AI assistants? What is the most used feature?

Ye: In terms of usage, there are currently about one million daily uses, with an average of seven uses per person, and about ten million calls per day. The most used feature is DingTalk's own integrated AI assistant.

TMTPost: Does the current "shallow" application of the Agent Store also represent the current state of AI applications?

Ye: I think AI is still in its early stage. Current applications are still limited to chat scenarios. But the real value of large AI models lies in privatized scenarios after integrating unique data. Only when large models enter projects and companies will they truly begin to take root.

Based on this, DingTalk is determined to follow the enterprise route, integrating with enterprise data, allowing AI to directly connect to internal company databases, and directly generating reports through model management.

Moreover, in the future, personnel from business departments can also build large models that empower their respective businesses through simple database configuration.

TMTPost: In the future, what will be the relationship between Agents and Cool Apps?

Ye: Cool Apps serve as an interactive interface, with their greatest value being the ability to recommend various functions to users. With the advent of large models, enterprises can interact with the interface proactively by calling small components. The combination of Cool Apps and large models can also address the issue of "hallucinations" in large models through the structured nature of Cool Apps.

To solve the problem of "hallucinations" in large models, it is essential to consider using structured data forms to assist Agents. Cool Apps and low-code solutions largely address the issue of structured data.

TMTPost: In the AI era, what do you think is the biggest change for the DingTalk team in terms of products and the team itself?

Ye: AI has brought significant changes to DingTalk as a whole, to the extent that all modules can be reshaped by AI.

Previously, DingTalk was more of a chat tool. Now, DingTalk integrates a large number of application scenarios and connects them through RPA, bringing an entirely different interactive experience. We have eliminated many of the cumbersome features from before and adopted a different approach. For example, in DingTalk team approvals, the approval process now does not exceed three levels, returning decision-making responsibility to the frontline employees who raise requests, thus achieving self-drive. This makes the company more flexible and faster.

TMTPost: What about other changes?

Ye: First, we can now use AI capabilities to remember and process documents. Enterprises can "feed" document content to AI, which can then organize the key points and help people "remember."

Second, even enterprises without particularly strong technical capabilities can use AI assistants similar to no-code solutions to build internal and external customer service, internal employee knowledge bases, external product manuals, and other relatively standardized scenarios.

Third, in creative scenarios, enterprises can deeply integrate work systems, production systems, and marketing systems. By leveraging the capabilities of large AI models, they can assist in intelligent decision-making and production scheduling. This step requires a certain degree of customized development to make the large model understand your business system.

TMTPost: What is the main obstacle for enterprises to reach the third step of deeply integrating large models with business systems?

Ye: The biggest obstacle is still the cost of engineering, as this is not a process that can be quickly advanced through productization.

But this obstacle also brings opportunities for many companies. Many companies used to do projects, now they do AI integration. With AI, the product experience of many enterprises has changed.

If this change is not significant enough, their motivation will be insufficient. Take DingTalk as an example: AI has now been integrated into every aspect of DingTalk's products, and compared to other technologies, the changes it brings are disruptive. However, due to the diversity of DingTalk's user base, DingTalk still needs to make changes step by step, and cannot change the habits of one group of users for the sake of another.

TMTPost: How to build the AI ecosystem? What role does DingTalk play in it?

Ye: DingTalk is a catalyst. For example, nearly 100 SaaS companies have undergone AI transformation and integrated with DingTalk. But I think it's still not enough, because many companies remain reluctant to invest in the AI industry, as the returns are not yet large enough.

TMTPost: For these SaaS companies that have undergone AI transformation, what is the biggest change you have seen?

Ye: Through AI transformation, SaaS companies have changed their software interaction methods, making the high-frequency interface interactions of SaaS software more user-friendly. After accumulating for a period of time, the SaaS market will also see some improvements through AI, and the costs of these SaaS companies will be reduced through large model technology.

TMTPost: Do you think that whether it is SaaS or Agent, through AI transformation, in which application scenarios will breakthroughs be achieved in the future?

Ye: First of all, the most obvious is the customer service scenario. Customer service calls under 30 seconds, in particular, can be fully AI-automated, saving a lot of labor costs.

Secondly, RPA linking multiple systems is also a scenario with obvious breakthroughs. The previous RPA was too rigid, but with a large model and AI assistant, batch processing can be performed, which is particularly suitable for batch processing in financial, e-commerce, and other scenarios.

TMTPost: The future payment model for Agents may change tremendously. For one thing, Agents are not as difficult to charge for as software, which feels expensive no matter how it is sold. Moreover, buying software in the future may simply mean buying Agents.

Ye: The price of an Agent actually corresponds to the cost of a certain task. For example, if a task originally required a certain amount of time or labor cost, using an Agent can change that, which is easy to measure. This is similar to RPA; a task that originally took two hours can now be completed in ten minutes, saving the cost of time.
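Ye's pricing logic can be put in arithmetic form. The hourly rate below is an assumed figure for illustration; only the two-hours-to-ten-minutes example comes from the interview.

```python
# Anchoring an Agent's price to the labor cost it replaces.
# hourly_rate is an assumed, illustrative figure; the task
# durations follow the two-hour / ten-minute example above.

hourly_rate = 60.0        # assumed labor cost per hour (illustrative)
manual_hours = 2.0        # task done by hand
agent_hours = 10 / 60     # same task via an Agent: ten minutes

saving = (manual_hours - agent_hours) * hourly_rate
print(f"labor cost saved per task: {saving:.2f}")
```

Any Agent priced below this per-task saving is easy to justify, which is the measurability Ye points to.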

TMTPost: As you see, what are the differences in supporting a SaaS application versus a future AI application or AI Agent for an ecosystem?

Ye: Currently, there are still relatively few companies that are AI-native. Some SaaS companies are working on it, but due to the pressure of survival challenges, these are only used as differentiated competitiveness and have not become core products. However, in the future, AI-native startups will definitely emerge.

The service side of SaaS will definitely remain unchanged, but the interaction side on the front end should all be handled in the form of assistants. I think this form of SaaS is somewhat like new public accounts or new mini-programs. Previously, we never thought that mini-programs could also play games. In the future, AI should also become this complex.

TMTPost: In your view, what is the difference between AI-native applications and AI-improved applications?

Ye: Most applications now are still plugin-type applications with AI assistants, only replacing a task and not an industry or product, merely changing the efficiency of a subtask.

The development process of AI-native applications, including the size of the company, will change. For example, if a person starts their own small company, the first thing they develop on day one of entrepreneurship is a native application. They will no longer build the traditional interface interactions or engineered process systems, only needing to connect core logic, call data, and focus mainly on AI training, AI tuning, knowledge bases, memory management, and chain-of-thought optimization... The personnel structure, workload distribution, and project management of the company will undergo significant changes.

TMTPost: What different capabilities will you provide to partners to truly build an AI ecosystem?

Ye: We will provide two capabilities. In the model ecosystem, we give partners scenarios, because general large models now lack scenarios and, of course, customers as well. We can also provide them with customers. But I think scenarios are more important than customers, because scenarios inspire innovation.

DingTalk will integrate the characteristics of various large models and collaborate with large model vendors to explore the application of these capabilities in DingTalk's own products, leveraging the strengths of different large models. It will also open up the creation of AI assistants (AI Agents), allowing developers to choose the appropriate large model based on their needs.

For SaaS products, we inject their main interfaces into the high-frequency pages of DingTalk chats and also open up Cool App blocks. For the application creation ecosystem, what partners ultimately need are users.

TMTPost: How do you evaluate these different large models in this collaboration? What are their strengths? How do you negotiate over terms, prices, and cooperation methods?

Ye: We talk with vendors about which scenarios they prefer to work on. Large models have different parameters, prices, inference speeds, and capabilities. We will clearly document these characteristics, allowing users to choose freely based on their needs.

Currently, use is officially free for a limited time, but later on, charges will be determined based on the complexity of the functions. The business model we are discussing with vendors is mainly subscription-based per function. One important judgment we have made is that call volume will not become a bottleneck, given the current price drop in large model calls. Any cap on the number of calls is just a figure that, we estimate, will never be reached.

TMTPost: How do you see the difference between the AI ecosystem and the original SaaS ecosystem?

Ye: The AI ecosystem processes data, while the original SaaS ecosystem processes workflows. Workflows are relatively generic, an abstraction in which the system decides what to do at each step; AI goes directly into the "how to do it," which is based on data.

TMTPost: Initially, you ran the entire DingTalk scenario with the Tongyi large model. Now, you are exploring it again with a bunch of models. What differences have you found so far?

Ye: First, it can trigger more customer demands. Second, the performance and cost of these large models are different. Their existence is inevitable because of differentiation, and there are indeed some specialties. Third, the differentiation of customers. Overall, the market performance is becoming increasingly healthy, which is certainly beneficial for industry development.

TMTPost: DingTalk is now also developing its own products, including its own AI assistant. Where are the boundaries in the future AI ecosystem?

Ye: Today, we limit this boundary to the Agent's entry and definition creation. The Agent's own capabilities, skills, underlying reasoning abilities, and results are all handed over to the ecosystem. The AI Agent has to perform many interactions behind the scenes, and we delegate the response capabilities to the ecosystem. It's akin to handing over the CPU to the ecosystem while we handle the bus and the integration, ensuring the circulation of all core components and providing a unified interface.

TMTPost: In the ecosystem, how can data be effectively isolated to ensure security while also being integrated to maximize its effect? How do you balance this? DingTalk seems to play a gatekeeper role.

Ye: From the beginning, we have been working on tenant isolation, and now the isolation is becoming cleaner. We have also implemented localized storage for many key data points, such as files, documents, emails, low-code processes, and IM chat messages. Some small enterprises don't care much about this because they don't think it's worth the cost, but as companies grow, they become more concerned. They can then take the data, train it locally, and use the model locally. I believe many large enterprises will reach this stage.

TMTPost: Currently, some enterprises access large models through platforms like DingTalk, but there are also some, especially large enterprises, that choose to develop a large model themselves. What do you think are the pros and cons of these two modes?

Ye: Choosing which large model to use is largely a commercial and market behavior. Large enterprises have many customized and tailored needs, requiring deep applications and a lot of specific adjustments. Especially for industry-specific large models, they need to find models for development themselves. But small enterprises don't have such deep needs and don't need to do this, as the cost is too high.

TMTPost: Finally, do you think DingTalk's opportunities and future profit models will change in this round of AI ecosystem construction?

Ye: It's not just DingTalk; all companies want to profit from AI. Before making a profit, the first problem to solve is to make it a high-frequency, repeatedly used feature. So I have set a goal for myself: 50 million calls per day.

Once this goal is reached, I can tell this story and deduce the profit margin from the cost, which serves as a baseline. Moreover, after reaching this number of calls, I won't be too concerned about the exact number of uses, because once volume reaches a certain level the marginal cost becomes very low, and we won't charge per call but rather a fixed amount. But the question is: do you have a feature that users will use frequently? In our discussions with many SaaS partners, we found that they haven't found a particularly good way to profit either. If it ultimately turns into a project-based model, it will eventually "die."