This week, we begin with an article about how China is pressing big tech companies such as Tencent, Alibaba, and ByteDance to share the massive troves of personal data they collect from their Chinese users. Next, we have a map showing the internet’s biggest sources of breached data, from June 2011 to the present. The following article explains how data science is yielding new insight into the effects of air pollution on human health in the United States. Following that, we have an article about the recent discovery of a vast trove of sensitive data, including login credentials, browser cookies, autofill data, and payment information, extracted by unidentified malware. Next is a piece about AI-based credit scoring for digital lending in India and how it favours traditionally privileged groups. Finally, we have an article about how Google is using machine learning to help design its next generation of AI chips: work that takes humans months can be accomplished by AI in under six hours.
China’s New Power Play: More Control of Tech Companies’ Troves of Data
Shortly after rising to power in late 2012, Xi Jinping made his first company visit in his new job as China’s Communist Party chief, to Tencent Holdings Ltd. There, he raised a topic that has become both an opportunity and a challenge for his rule: the vast troves of personal data being gathered by the country’s technology companies.
Ten Years of Breaches in One Image
This is a map of the internet’s biggest sources of breached data, from June 2011 to today. The data is drawn from Troy Hunt’s Have I Been Pwned project (with minor adjustments), so you can click through to the site to see if you’re included. Each bubble represents a single breach, and as you scroll down, you’ll see them getting bigger and coming faster until the sheer volume is overwhelming.
How Data Science Gives New Insight Into Air Pollution in the US
“To do really important research in environmental policy,” said Francesca Dominici, “the first thing we need is data.” Dominici, a professor of biostatistics at the Harvard T.H. Chan School of Public Health and co-director of the Harvard Data Science Initiative, recently presented the Henry W. Kendall Memorial Lecture at MIT. She described how, by leveraging massive amounts of data, Dominici and a consortium of her colleagues across the nation are revealing, on a grand scale, the effects air pollution levels have on human health in the United States.
Mystery Malware Steals 26M Passwords from Millions of PCs. Are You Affected?
Researchers have discovered yet another massive trove of sensitive data, a dizzying 1.2TB database containing login credentials, browser cookies, autofill data, and payment information extracted by malware that has yet to be identified. In all, researchers from NordLocker said on Wednesday, the database contained 26 million login credentials, 1.1 million unique email addresses, more than 2 billion browser cookies, and 6.6 million files.
AI-based loan apps are booming in India, but some borrowers miss out
As the founder of a consumer rights non-profit in India, Karnav Shah is used to seeing sharp practices and disgruntled customers. But even he has been surprised by the sheer volume of complaints against digital lenders in recent years. While most of the grievances are about unauthorised lending platforms misusing borrowers’ data or harassing them for missed payments, others relate to high interest rates or loan requests that were rejected without explanation, Shah said.
Google is Using AI to Design its Next Generation of AI Chips More Quickly Than Humans Can
Google is using machine learning to help design its next generation of machine learning chips. The algorithm’s designs are “comparable or superior” to those created by humans, say Google’s engineers, but can be generated much, much faster. According to the tech giant, work that takes months for humans can be accomplished by AI in under six hours.
Source: https://mailchi.mp/zigram/data-asset-weekly-dispatch_14_june_2