A representative of Sberbank's press service did not answer questions from a ComNews correspondent.
Posted: Sun Jan 19, 2025 6:38 am
A representative of the ABD press service told a ComNews correspondent that the preliminary draft of the national standard draws on global experience, current developments and the association's research into a risk-based approach to data processing and the capabilities of neural network technologies. According to him, representatives of the regulator, the scientific community and experts from the association's member companies are taking part in the development.
"Many ABD participants already use synthetic data for internal tasks such as testing. We have summarized this experience and hope that synthetic data will contribute to the development of the data market and artificial intelligence. The current version of the document has already been submitted to TC 164 for consideration and, according to the standardization procedure, the final version will be presented in the second quarter of 2025, after the discussion stage. Synthetic data, in our opinion, opens up new horizons for increasing the availability and openness of data while ensuring its safe use without threats to privacy. It is becoming an effective alternative to anonymized data and creates new conditions for the development of AI, which helps strengthen Russia's technological sovereignty," said a representative of the ABD press service.
Synthetic data mimics real, human-generated data but is created with computational algorithms and generative AI models. It reproduces the statistical properties of real data without containing the original records themselves. AI companies use it to train language models and to test machine learning systems. Synthetic data is a virtually unlimited source of training material, since a developer can generate as much of it as needed. It can also be used for research in areas that involve confidential information or are protected by regulation, such as copyrighted material, healthcare and finance. Finally, synthetic data can reduce bias in trained AI models by counterbalancing biased wording or opinions drawn from publicly available sources.
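As a rough illustration of the idea that synthetic data keeps the statistical properties of a dataset without copying its records, here is a minimal Python sketch. It is an assumption-laden toy, not the method described in the draft standard or used by any ABD member: the "real" balances are made-up numbers, the function names are hypothetical, and a single-column normal distribution stands in for the generative models mentioned in the article.

```python
import random
import statistics


def fit_gaussian(values):
    """Summarize a numeric column by its mean and sample standard deviation."""
    return statistics.mean(values), statistics.stdev(values)


def sample_synthetic(mean, stdev, n, seed=0):
    """Draw n synthetic values from a normal distribution with the fitted parameters."""
    rng = random.Random(seed)  # seeded so the sketch is reproducible
    return [rng.gauss(mean, stdev) for _ in range(n)]


# Hypothetical "real" records: account balances (made-up numbers for illustration).
real_balances = [1200.0, 950.0, 1430.0, 1100.0, 1010.0, 1320.0, 880.0, 1250.0]

mean, stdev = fit_gaussian(real_balances)
synthetic_balances = sample_synthetic(mean, stdev, n=10_000)

# Compare summary statistics of the real and synthetic columns.
print("mean :", round(mean, 1), "vs", round(statistics.mean(synthetic_balances), 1))
print("stdev:", round(stdev, 1), "vs", round(statistics.stdev(synthetic_balances), 1))
```

Only the fitted mean and standard deviation pass from the real column to the generator, so the synthetic values are new draws rather than copies of the original records. Matching two summary statistics is far weaker than what production generators aim for, and it is not by itself a privacy guarantee.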
Earlier, in December 2024, Ilya Sutskever, a co-founder of OpenAI and one of the creators of ChatGPT, said that the growth of computing power for AI models has outpaced the rate at which new data is generated, and that the neural network industry has reached a peak in the data it can use.