2020 challenged some of the best-laid plans of enterprises. With nearly everything moving online, COVID-19 pushed forward years of digital transformation, and DevOps was at the heart of that journey. After all, delivering software quickly, reliably, and safely to meet the changing needs of customers was crucial to adapting to this new normal.
It is unlikely that the pace of modernization will slow down in 2021. As IT and business leaders further drive digital adoption within their organizations via DevOps, the need to quantify the business benefit of a digital transformation remains top of mind. A reliable model is imperative to drive the right level of investment and to measure the returns. This is precisely why we wrote How to Measure ROI of DevOps Transformation. The white paper is backed by scientific studies conducted by DevOps Research and Assessment (DORA) with 31,000 professionals worldwide over six years, providing clear guidance based on impartial industry data. We found that the financial savings of a DevOps transformation vary from $10M to $259M a year.
Looking beyond cost to value
The most innovative companies undertake their technology transformations with a focus on the value they can deliver to their customers. Hence, in addition to measuring cost savings, we show how DevOps done right can be a value driver and innovation engine. Let’s look deeper into how we quantify the cost and value-generating power of DevOps.
Here, we focus on quantifying the cost savings and efficiencies realized by implementing DevOps—for example, how an investment in DevOps reduces costs by cutting the time it takes to resolve outages and avoiding downtime as much as possible.
However, focusing solely on reducing costs rarely yields systemic, long-term gains, which makes it all the more important to go beyond cost-driven strategies. The cost savings achieved in year one “no longer count” beyond year two, as the organization adjusts to a new baseline of costs and performance. Worse, focusing only on cost savings signals to technical staff that their jobs may be at risk from automation, rather than that they are being liberated from drudge work to better drive business growth. This hurts both morale and productivity.
There are two value drivers in a DevOps transformation: (1) improved efficiency through the reduction of unnecessary rework, and (2) the potential revenue gained by reinvesting the time saved into new capabilities and offerings.
Adding these cost- and value-driven categories together, IT and business decision makers can estimate the potential value their organizations can expect to gain from a DevOps transformation. This helps justify the investment needed to implement the required changes. To quantify the impact, we leverage industry benchmark data across low, medium, high, and elite DevOps teams, as described by DORA in its annual Accelerate: State of DevOps report.
Combining cost and value
As an example, let’s consider the impact of a DevOps transformation on a large organization with 8,500 technical staff that is a medium IT performer. Using the data from the DevOps report, we can calculate both the cost- and value-driven categories, along with the total impact.
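To make the model concrete, here is a minimal sketch of how the two categories combine. All inputs below (salary, benefits multiplier, rework percentages, reinvestment factor) are hypothetical placeholders for illustration, not DORA's published benchmark figures; use the white paper's methodology for real estimates.

```python
# Illustrative sketch only: every number below is a hypothetical
# placeholder, not a DORA benchmark figure.

def devops_transformation_value(staff, avg_salary, benefits_multiplier,
                                rework_pct_before, rework_pct_after,
                                reinvestment_value_factor):
    """Annual impact = cost savings from reduced rework plus the value
    of reinvesting the recovered time into new capabilities."""
    fully_loaded_payroll = staff * avg_salary * benefits_multiplier
    recovered_spend = fully_loaded_payroll * (rework_pct_before - rework_pct_after)
    cost_savings = recovered_spend                               # cost-driven category
    value_created = recovered_spend * reinvestment_value_factor  # value-driven category
    return cost_savings + value_created

# A large organization with 8,500 technical staff, as in the example above
estimate = devops_transformation_value(
    staff=8_500, avg_salary=100_000, benefits_multiplier=1.4,
    rework_pct_before=0.20, rework_pct_after=0.10,
    reinvestment_value_factor=1.0)
```

With these made-up inputs the model yields an annual impact in the hundreds of millions of dollars, which is why the white paper reports such a wide savings range across organization sizes and performance profiles.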
While this example represents what a medium IT performer at a large organization might expect by investing in DevOps, companies of all sizes and performance profiles can leverage DevOps to drive performance. In the white paper, we calculate the impact of DevOps across organizations of different sizes—small, medium, and large—as well as across four distinct performance profiles—low, medium, high, elite.
There will be variation in these measurements based on your team’s current performance, compensation, change fail rate, benefits multiplier, and deployments per year, so we share our methodology in the white paper and invite you to customize the approach based on your specific needs and constraints.
Years of DORA research show that undertaking a technology transformation initiative can produce sizable returns for any organization. Our goal with the white paper is to provide IT and business decision makers an industry backed, data driven foundational basis for determining their investment in DevOps. Download the white paper here to calculate the impact of DevOps on your organization, while driving your digital transformation.
Open Sourcing Tilt Brush
Tilt Brush, Google’s virtual reality painting application, has collaborated with amazing creators over the years, many of whom were part of our Artist in Residence Program. We take tremendous pride in all of those collaborations, and the best part has been watching our community learn from each other and develop their abilities over the years.
As we continue to build helpful and immersive AR experiences, we want to continue supporting the artists using Tilt Brush by putting it in your hands. This means open sourcing Tilt Brush, allowing everyone to learn how we built the project, and encouraging them to take it in directions that are near and dear to them.
Tilt Brush launched on the SteamVR platform for the HTC Vive VR headset in April 2016. It went on to help users create their artwork on every major VR platform, including the Oculus Rift, Windows Mixed Reality, Valve Index, PlayStation VR, and Oculus Quest VR headsets. Tilt Brush won dozens of awards, including the Unity Awards 2015: Best VR Experience, the Cannes Lions 2017 Gold Lion in Innovation, and the Oculus Quest award for Best of 2019: VR Creativity Tool of the Year, and was often featured on The Tonight Show Starring Jimmy Fallon. As we look back on Tilt Brush, we’re proud of what this creative application has achieved, and excited for where the community will take it.
The open source archive of the Tilt Brush code can be found at: https://github.com/googlevr/tilt-brush
In order to be able to release the Tilt Brush code as open source, there were a few things we had to change or remove due to licensing restrictions. In almost all cases, we documented the process for adding those features back in our comprehensive build guide. ‘Out of the box’, the code in the archive will compile a working version of Tilt Brush, requiring you only to add the SteamVR Unity SDK.
The currently published version of Tilt Brush will always remain available in digital stores for users with supported VR headsets. If you’re interested in creating your own Tilt Brush experience, please review the build guide and visit our GitHub repo to access the source code.
Cheers, and happy painting from the Tilt Brush team!
By Tim Aidley, Software Engineer, and Jon Corralejo, Program Manager – Tilt Brush
Posted by Nikita Gandhi (Community Manager, GDG India), Nilay Yener (Program Manager, Flutter DevRel)
Happy New Year folks. It’s the perfect time of year to learn something new! Do you have an app idea you’ve been dreaming of over the holidays? If so, we have just the opportunity for you! Starting February 1st, leading up to our big event on March 3rd, join us for #30DaysOfFlutter to kickstart your learning journey and meet Flutter experts in the community. Whether you are building your first Flutter app or looking to improve your Flutter skills, we have curated content, code labs, and demos!
Flutter is Google’s open source UI toolkit for building beautiful, natively compiled applications for mobile, web, and desktop from a single codebase. It’s one of the fastest growing, most in-demand cross platform frameworks to learn and is used by freelance developers and large organizations around the world. Flutter uses the Dart language, so it will feel natural to many of you familiar with object-oriented languages.
Jump in, the water’s fine!
Along with the curated content, we will also have four live AskMeAnything sessions (#AMAs), where you can meet members of Google’s Flutter team and community. You can also join us on the FlutterDev Discord channel, where you can meet the other members of the community, ask and answer questions, and maybe make some new Flutter friends too!
Does this sound exciting? Visit the 30 Days of Flutter website to get more information and to register to join.
Your learning journey with Flutter for the month will look like this:
- Receive curated content in your inbox, meet other Flutter devs on Discord, and attend the kickoff webinar on February 1st.
- Receive more content, start building your first Flutter app, and join the webinar to ask your questions.
- Work on your app and attend the third webinar to ask your questions.
- Complete your project and learn how to share it with the Flutter community.
Are you ready to learn one of the most in demand developer skills in the world?
Sign up to be a part of the journey and be sure to follow @FlutterDev on Twitter, to get updates about #30DaysOfFlutter.
Quick launch summary
Now when you’re sharing your screen, Chrome will automatically hide the content of web pop-up notifications. This includes notifications from Google Chat, email notifications, and notifications from other third-party websites.
When you’re done sharing your screen, all muted notifications will be automatically displayed. Note that notifications are already muted when sharing a tab in Google Meet.
There has been a dramatic shift in how many of us work: now more than ever, we’re relying on Google Meet and other screen-sharing solutions. We hope this feature will reduce distractions and prevent sensitive or personal information from accidentally being displayed while you share your screen.
- Admins: There is no admin control for this feature.
- End users: When sharing your screen, notification content will automatically be hidden. You can hover over a notification and select “Mute” to dismiss a notification and prevent any additional notifications. Or you can select “Show Content”, which will display content for the current and future notifications.
- Rapid Release and Scheduled Release domains: Full rollout (1–3 days for feature visibility) starting on January 25, 2021
- Available to Google Workspace Essentials, Business Starter, Business Standard, Business Plus, Enterprise Essentials, Enterprise Standard, and Enterprise Plus, as well as G Suite Basic, Business, Education, Enterprise for Education, and Nonprofits customers
- Available to users with personal Google Accounts
We are excited to announce that the GKE CIS 1.1.0 Benchmark InSpec profile is now available on GitHub under an open source software license. You can now validate the security posture of your Google Kubernetes Engine (GKE) clusters using Chef InSpec™ by assessing their compliance against the Center for Internet Security (CIS) 1.1.0 benchmark for GKE.
Validating security compliance of GKE
GKE is a popular platform for running containerized applications. Many organizations have selected GKE for its scalability, self-healing, observability, and integrations with other services on Google Cloud. Developer agility is one of the most compelling arguments for moving to a microservices architecture on Kubernetes, but that agility means configuration changes land at a faster pace, which demands security checks as part of the development lifecycle.
Validating the security settings of your GKE cluster is a complex challenge, requiring analysis across multiple layers of your cloud infrastructure:
GKE is a managed service on GCP, with controls to tweak the cluster’s behavior that affect its security posture. These cloud resource configurations can be set and audited via Infrastructure-as-Code (IaC) frameworks such as Terraform, the gcloud command line tool, or the Google Cloud Console.
Application workloads are deployed on GKE by interacting with the Kubernetes (K8S) API. Kubernetes resources such as pods, deployments, and services are often deployed from YAML templates using the command line tool kubectl.
InSpec for auditing GKE
InSpec is a popular DevSecOps framework that checks the configuration state of resources in virtual machines and containers, and on cloud providers such as Google Cloud, AWS, and Microsoft Azure. The InSpec GCP resource pack 1.8 (InSpec-GCP) provides a consistent way to audit GCP resources and can be used to validate the attributes of a GKE cluster against a desired state declared in code. We previously released a blog post on how to validate your Google Cloud resources with InSpec-GCP against compliance profiles such as the CIS 1.1.0 benchmark for GCP.
While you can use the InSpec-GCP resource pack to define InSpec controls that validate resources against the Google Cloud API, it does not let you directly validate configurations in the other relevant layers, such as Kubernetes resources and configuration files on the nodes. Luckily, the challenge of auditing Kubernetes resources with InSpec has already been solved by the inspec-k8s resource pack, and files on nodes can be audited over SSH. Altogether, we can validate the security posture of GKE holistically using the inspec-gcp and inspec-k8s resource packs, plus controls using the InSpec file resource executed in an SSH session.
Running the CIS for GKE compliance profile with InSpec
The script requires the cluster name (-c), SSH username (-u), private key file for SSH authentication (-k), cluster region or zone (-r or -z), and an InSpec input file as required by the inspec.yml files in each profile (-i). As an example, the following line will run all three profiles to validate the compliance of cluster inspec-cluster in zone us-central1-a:
./run_profiles.sh -c inspec-cluster -z us-central1-a -u <ssh username> -k <path to private key> -i <input file>
Running InSpec profile inspec-gke-cis-gcp …
Profile: InSpec GKE CIS 1.1 Benchmark (inspec-gke-cis-gcp)
Target: gcp://<service account used for InSpec>
<lots of InSpec output omitted>
Profile Summary: 16 successful controls, 10 control failures, 2 controls skipped
Test Summary: 18 successful, 11 failures, 2 skipped
Stored report in reports/inspec-gke-cis-gcp_report.
Running InSpec profile inspec-gke-cis-k8s …
Profile: InSpec GKE CIS 1.1 Benchmark (inspec-gke-cis-k8s)
Target: kubernetes://<IP address of K8S endpoint>:443
<lots of InSpec output omitted>
Profile Summary: 9 successful controls, 1 control failure, 0 controls skipped
Test Summary: 9 successful, 1 failure, 0 skipped
Stored report in reports/inspec-gke-cis-k8s_report.
Running InSpec profile inspec-gke-cis-ssh on node <cluster node 1> …
Profile: InSpec GKE CIS 1.1 Benchmark (inspec-gke-cis-ssh)
Target: ssh://<username>@<cluster node 1>:22
<lots of InSpec output omitted>
Profile Summary: 10 successful controls, 5 control failures, 1 control skipped
Test Summary: 12 successful, 6 failures, 1 skipped
Stored report in reports/inspec-gke-cis-ssh_<cluster node 1>_report.
Analyze your scan reports
Once the wrapper script has completed successfully you should analyze the JSON or HTML reports to validate the compliance of your GKE cluster. One way to perform the analysis is to upload the collection of JSON reports of a single run from the reports folder to the open source InSpec visualization tool Heimdall Lite (GitHub) by the Mitre Corporation. An example of a compliance dashboard is shown below:
Try it yourself and run the GKE CIS 1.1.0 Benchmark InSpec profile in your Google Cloud environment! Clone the repository and follow the CLI example in the Readme file to run the InSpec profiles against your GKE clusters. We also encourage you to report any issues on GitHub that you may find, suggest additional features and to contribute to the project using pull requests. Also, you can read our previous blog post on using InSpec-GCP for compliance validations of your GCP environment.
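If you prefer to sanity-check the JSON reports from the reports folder before uploading them to Heimdall Lite, a short script can tally result statuses. This sketch assumes the nesting used by InSpec's JSON reporter (profiles containing controls containing results, each with a status field); exact field names can vary across InSpec versions.

```python
# Sketch of a pre-upload sanity check for InSpec JSON reports, assuming
# the profiles -> controls -> results nesting of the JSON reporter.

def summarize_report(report):
    """Tally passed/failed/skipped test results across all profiles."""
    counts = {"passed": 0, "failed": 0, "skipped": 0}
    for profile in report.get("profiles", []):
        for control in profile.get("controls", []):
            for result in control.get("results", []):
                status = result.get("status")
                if status in counts:
                    counts[status] += 1
    return counts

# Tiny synthetic report for illustration; real reports are produced by
# the wrapper script and parsed with json.load().
sample = {"profiles": [{"controls": [
    {"id": "gke-cis-1", "results": [{"status": "passed"},
                                    {"status": "failed"}]},
    {"id": "gke-cis-2", "results": [{"status": "skipped"}]},
]}]}
```

The resulting counts should line up with the "Test Summary" lines printed by the wrapper script.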
Advertising is essential to keeping the web open for everyone, but the web ecosystem is at risk if privacy practices do not keep up with changing expectations. People want assurances that their identity and information are safe as they browse the web. That’s why Chrome introduced the Privacy Sandbox and today shared progress on its path to eliminating third-party cookies. The goal is to replace them with viable privacy-first alternatives, developed alongside ecosystem partners, that help publishers and advertisers succeed while also protecting people’s privacy as they move across the web.
It might be hard to imagine how advertising on the web could be relevant, and accurately measured, without third-party cookies. When the Privacy Sandbox technology for interest-based advertising (FLoC) was first proposed last year, we started with the idea that groups of people with common interests could replace individual identifiers. Today, we’re releasing new data showing how this innovation can deliver results nearly as effective as cookie-based approaches. Technology advancements such as FLoC, along with similar promising efforts in areas like measurement, fraud protection and anti-fingerprinting, are the future of web advertising — and the Privacy Sandbox will power our web products in a post-third-party cookie world.
Federated Learning of Cohorts (FLoC) proposes a new way for businesses to reach people with relevant content and ads by clustering large groups of people with similar interests. This approach effectively hides individuals “in the crowd” and uses on-device processing to keep a person’s web history private on the browser.
By creating simulations based on the principles defined in Chrome’s FLoC proposal, Google’s ads teams have tested this privacy-first alternative to third-party cookies. Results indicate that when it comes to generating interest-based audiences, FLoC can provide an effective replacement signal for third-party cookies. Our tests of FLoC to reach in-market and affinity Google Audiences show that advertisers can expect to see at least 95% of the conversions per dollar spent when compared to cookie-based advertising. The specific result depends on the strength of the clustering algorithm that FLoC uses and the type of audience being reached.
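To build intuition for how cohort clustering can work, here is a toy, SimHash-style sketch: each domain in a browsing history contributes a deterministic ±1 vector, and the signs of the summed vector form a small cohort id shared by users with similar histories. This only illustrates the general idea of hiding individuals "in the crowd"; it is not Chrome's FLoC algorithm, and the bit width and hashing scheme are invented.

```python
import hashlib

# Toy SimHash-style cohort assignment, loosely inspired by the FLoC idea
# that users with similar browsing histories should share a cohort id.
# All parameters here are invented for illustration.

NUM_BITS = 8  # 2**8 = 256 possible cohorts in this toy model

def domain_vector(domain, bits=NUM_BITS):
    """Deterministically map a domain to a +1/-1 component per bit."""
    digest = hashlib.sha256(domain.encode()).digest()
    return [1 if (digest[i % len(digest)] >> (i % 8)) & 1 else -1
            for i in range(bits)]

def cohort_id(domains, bits=NUM_BITS):
    """Sum the per-domain vectors; the sign pattern is the cohort id."""
    sums = [0] * bits
    for domain in domains:
        for i, component in enumerate(domain_vector(domain, bits)):
            sums[i] += component
    return sum(1 << i for i, s in enumerate(sums) if s > 0)
```

Because only the coarse cohort id leaves the device, the browser never has to reveal which individual domains produced it, which is the property the paragraph above describes.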
We’re encouraged by what we’ve observed and the value that this solution offers to users, publishers and advertisers. Chrome intends to make FLoC-based cohorts available for public testing through origin trials with its next release in March and we expect to begin testing FLoC-based cohorts with advertisers in Google Ads in Q2. If you’d like to get a head start, you can run your own simulations (as we did) based on the principles outlined in this FLoC whitepaper.
The Privacy Sandbox also includes proposals for how marketers can create and deploy their own audiences, without the use of third-party cookies. One example is when advertisers want to reach prior visitors to their website via remarketing.
Over the last year, several members of the ad tech community have offered input for how this might work, including proposals from Criteo, NextRoll, Magnite and RTB House. Chrome has published a new proposal called FLEDGE that expands on a previous Chrome proposal (called TURTLEDOVE) and takes into account the industry feedback they’ve heard, including the idea of using a “trusted server” — as defined by compliance with certain principles and policies — that’s specifically designed to store information about a campaign’s bids and budgets. Chrome intends to make FLEDGE available for testing through origin trials later this year with the opportunity for ad tech companies to try using the API under a “bring your own server” model.
While proposals such as FLoC and FLEDGE explore privacy-preserving alternatives for reaching relevant audiences, there’s also work being done to help buyers decide how much to bid for ads seen by these audiences. We invite ad exchanges, demand-side platforms and advertisers to begin experimenting with the technology in the Privacy Sandbox. Feedback from these tests will help ensure that ad auctions will continue to function seamlessly when third-party cookies are deprecated.
Chrome has proposed a number of technologies within the Privacy Sandbox that would allow marketers, and partners working on their behalf, to measure campaign performance without third-party cookies. These proposals protect consumer privacy while supporting key advertiser requirements, such as event-level reporting that enables bidding models to recognize patterns in the data, and aggregate-level reporting which delivers accurate measurement over groups of users.
By using privacy-preserving techniques like aggregating information, adding noise, and limiting the amount of data that gets sent from the device, the proposed APIs report conversions in a way that protects user privacy. For example, an event-level iteration of the API is currently available in origin trials for measuring click-through conversions. It protects privacy by introducing noise and limiting the bits of conversion data that the API can send at a time. As a result, advertisers will have to prioritize which conversions are most important for their reporting needs.
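As a rough illustration of two of those techniques, bit-limiting and noise, the sketch below caps a conversion value at 3 bits and occasionally replaces it with a random value (a simplified randomized-response scheme). The 3-bit cap matches the event-level click-through API described above, but the noise probability and function here are hypothetical, not the API's actual mechanism.

```python
import random

# Simplified illustration: keep only 3 bits of conversion metadata and
# occasionally replace the report with a random value, so no single
# report can be trusted on its own. Constants are hypothetical.

CONVERSION_BITS = 3
NOISE_PROBABILITY = 0.05  # chance of sending a random value instead

def noisy_conversion_report(value, rng=random):
    capped = value % (1 << CONVERSION_BITS)  # keep only the low 3 bits
    if rng.random() < NOISE_PROBABILITY:
        return rng.randrange(1 << CONVERSION_BITS)  # any 3-bit value
    return capped
```

Because any individual report may be noise, advertisers can only rely on aggregate counts, which is exactly why they will have to prioritize which conversions matter most for reporting.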
Over the coming months, Google’s ads teams will continue evaluating how the proposed conversion measurement APIs can be used alongside Google’s measurement products to support use cases such as reporting view-through conversions, determining incrementality and reach as well as performing attribution. We recommend customers implement sitewide tagging with the global site tag or Google Tag Manager in order to minimize disruptions during this time. More decisions will have to be made before a prototype is built — including what the right level of noise should be and what’s the minimum number of conversions to include when sending an aggregate-level report — so we invite ad tech companies, publishers and advertisers to get involved in these discussions within the public forums.
Ad fraud prevention
The health of the ad-supported web depends on companies being able to distinguish actual visitors from fraudulent traffic. That’s why Chrome opened the Trust Token API for testing last July to help verify authentic traffic without exposing people’s identities in the process. And today, Chrome shared plans to start an origin trial in March with their next release to support a new type of Trust Token issuer that would improve the detection of fraud on mobile devices while safeguarding user privacy. Google’s ads teams will then start testing this feature with trusted users on mobile, and share feedback within the public forums based on the results.
An important goal of the Privacy Sandbox is developing technology to protect people from opaque or hidden techniques that share data about individual users and allow them to be tracked covertly. One such tactic involves using a device’s IP address to try to identify someone without their knowledge or ability to opt out. Chrome recently published a new proposal, Gnatcatcher, for how someone’s IP address might be masked to protect that person’s identity without interfering with a website’s normal operations. This proposal will continue to be refined based on feedback from the web community.
The future of privacy on the web
Thanks to the initial FLoC results, ongoing development of the APIs and encouraging dialogue with the industry, we are more confident than ever that the Privacy Sandbox is the best path forward to improve privacy for web users while ensuring publishers can earn what they need to fund great content and advertisers can reach the right people for their products. For Google’s ads teams, the Privacy Sandbox technologies represent the future of how our ads and measurement products will work on the web. We encourage others to join us in defining this new approach which will create better experiences for consumers while providing more durable solutions for the ads industry.
As we move forward in 2021, you can expect to hear more about the progress being made in the Privacy Sandbox, including more opportunities for you to begin testing these new technologies in your campaigns. So, stay engaged in the public discussions about the Privacy Sandbox proposals in forums like the W3C’s Improving Web Advertising Business Group, or work with your technology partners to evaluate and experiment with the proposals that are already in origin trials. Together, we can reshape the web so that it works better for everyone.
A year ago we announced our intention to phase out third-party cookies and replace them with new browser features that are fundamentally more private. Since then, we’ve been working closely with the broader web community, including W3C, to design and implement new privacy-preserving technology, while also preserving the vitality and sustainability of the open web.
Today more than 30 different proposals have been offered by Chrome and others, including many that we believe are key to rendering third-party cookies obsolete. Early test results are also looking promising (see below)!
We are excited to continue testing this foundational tech with the active involvement of ecosystem partners and industry forums – all to move the web forward, together. What follows are key updates since our announcements last January and October.
Early results, and new proposals ready for testing
Five different Privacy Sandbox proposals are available for testing now – or will be very soon – in key areas like fraud detection, the tailoring of content, first-party treatment of a company’s owned and related domains, ads measurement, and a private-by-default way to request browser info. In fact, early testing of the Federated Learning of Cohorts (FLoC) algorithm shows that new privacy-preserving ad solutions can be similarly effective to cookie-based approaches. This is great news for users, publishers, and advertisers – all of whom are critical for the future of the web – so we’re excited to carry this work forward.
Another important area of focus is user-facing controls. In particular, it’s clear that people will want to tune whether content is tailored to them (or not) – in addition to keeping their private info private. With the Chrome 90 release in April, we’ll be releasing the first controls for the Privacy Sandbox (first, a simple on/off), and we plan to expand on these controls in future Chrome releases, as more proposals reach the origin trial stage, and we receive more feedback from end users and industry. You can find a full update on all trials on our blog.
Involvement across the ecosystem
It’s great to see companies like Salesforce, White Ops, and Yahoo! JAPAN starting (or preparing) to test initial solutions like Trust Tokens, First Party Sets, and Conversion Measurement. In fact, all developers have access to public Chrome experiments, and the latest guidance can be found on web.dev, so please do test and share feedback. This type of engagement helps ensure that the various APIs work as expected in real-world scenarios, so the more ecosystem participation, the better!
Building better. Together.
One of the things that makes the web so great is that it’s by and for all of us; this is a special quality amongst today’s platforms, and is definitely worth celebrating! It also creates complexity and trade-offs that we have to manage thoughtfully – and collectively – as we introduce new technology. That’s why we continue to engage in industry forums like the W3C, and are in active discussions with independent authorities – including privacy regulators and the UK’s Competition and Markets Authority – to help identify and shape the best approach for online privacy, and the industry and world as a whole.
So here’s to the users, and coders, and advertisers, and content creators (and so many others) who’ve made, and continue to make the platform what it is today. And here’s to coming together, in service of a more private web.
Posted by Justin Schuh – Director, Chrome Engineering
The pandemic has taken a devastating toll on communities worldwide. While there is much uncertainty still ahead, the development of multiple safe vaccines in such a short time gives us reason for hope. Now the work begins to ensure that everyone can benefit from this triumph of scientific achievement, and quickly.
During the pandemic, Google has helped people get the information they need to keep their families safe and healthy. We’ve supported small businesses and partnered with Apple to build exposure notification technology to fight the spread of COVID-19 around the world. Now, as public health authorities ramp up vaccination efforts, we’re finding more ways to help.
We recognize that getting vaccines to people is a complex problem to solve, and we’re committed to doing our part. Today we’re announcing that we’re providing more than $150 million to promote vaccine education and equitable distribution and making it easier to find locally relevant information, including when and where to get the vaccine. We’ll also be opening up Google spaces to serve as vaccination sites as needed.
$150 million to promote vaccine education and equitable access
Since the beginning of the pandemic, we’ve helped more than 100 government agencies and global non-governmental organizations run critical public service health announcements through our Ad Grants Crisis Relief program. Today, we’re announcing an additional $100 million in ad grants for the CDC Foundation, the World Health Organization, and nonprofits around the globe. We’ll invest another $50 million in partnership with public health agencies to reach underserved communities with vaccine-related content and information.
Our efforts will focus heavily on equitable access to vaccines. Early data in the U.S. shows that disproportionately affected populations, especially people of color and those in rural communities, aren’t getting access to the vaccine at the same rates as other groups. To help, Google.org has committed $5 million in grants to organizations addressing racial and geographic disparities in COVID-19 vaccinations, including Morehouse School of Medicine’s Satcher Health Leadership Institute and the CDC Foundation.
Highlighting authoritative information and local vaccination sites on Search & Maps
To help people find accurate and timely information on vaccines, we’ve expanded our information panels on Search to more than 40 countries and dozens of languages, with more rolling out in the coming weeks. We’ll begin showing state and regional distribution information on Search so people can easily find out when they are eligible to receive a vaccine. Soon we’ll launch a “Get The Facts” initiative across Google and YouTube to get authoritative information out to the public about vaccines.
Searches for “vaccines near me” have increased 5x since the beginning of the year and we want to make sure we’re providing locally relevant answers. In the coming weeks, COVID-19 vaccination locations will be available in Google Search and Maps, starting with Arizona, Louisiana, Mississippi and Texas, with more states and countries to come. We’ll include details like whether an appointment or referral is required, if access is limited to specific groups, or if it has a drive-through. We’re working with partners like VaccineFinder.org, an initiative of Boston Children’s Hospital, and other authoritative sources, such as government agencies and retail pharmacies, to gather vaccination location information and make it available.
Opening our spaces for vaccination clinics
To help with vaccination efforts, starting in the United States, we’ll make select Google facilities—such as buildings, parking lots and open spaces—available as needed. These sites will be open to anyone eligible for the vaccine based on state and local guidelines. We’ll start by partnering with health care provider One Medical and public health authorities to open sites in Los Angeles and the San Francisco Bay Area in California; Kirkland, Washington; and New York City, with plans to expand nationally. We’re working with local officials to determine when sites can open based on local vaccine availability.
Using our technology to improve vaccine distribution
Google Cloud is helping healthcare organizations, retail pharmacies, logistics companies, and public sector institutions make use of innovative technologies to speed up delivery of vaccines. For example, logistics companies are using our AI to optimize trucking operations by adapting to traffic or inclement weather, and to detect temperature fluctuations during transport. Once vaccines reach their destination, our tools help facilitate pre-screening, scheduling, and follow-up. And our Intelligent Vaccine Impact Platform is helping states like New York and North Carolina manage distribution and forecast where vaccines, personal protective equipment, and hospital staffing will be most needed.
The COVID-19 pandemic has deeply affected every community all over the world. It’s also inspired coordination between public and private sectors, and across international borders, on a remarkable scale. We can’t slow down now. Getting vaccines to billions of people won’t be easy, but it’s one of the most important problems we’ll solve in our lifetimes. Google will continue to support these efforts in whatever way we can.
Posted by Cibu Johny, Software Engineer, Google Research and Saumya Dalal, Product Manager, Google Geo
Nearly 75% of India’s population, in the country with the second highest number of internet users in the world, interacts with the web primarily using Indian languages rather than English. Over the next five years, that share is expected to rise to 90%. In order to make Google Maps as accessible as possible to the next billion users, it must allow people to use it in their preferred language, enabling them to explore anywhere in the world.
However, the names of most Indian places of interest (POIs) in Google Maps are not generally available in the native scripts of the languages of India. These names are often in English and may be combined with acronyms based on the Latin script, as well as Indian language words and names. Addressing such mixed-language representations requires a transliteration system that maps characters from one script to another, based on the source and target languages, while accounting for the phonetic properties of the words as well.
For example, consider a user in Ahmedabad, Gujarat, who is looking for a nearby hospital, KD Hospital. They issue the search query, કેડી હોસ્પિટલ, in the native script of Gujarati, the 6th most widely spoken language in India. Here, કેડી (“kay-dee”) is the sounding out of the acronym KD, and હોસ્પિટલ is “hospital”. In this search, Google Maps knows to look for hospitals, but it doesn’t understand that કેડી is KD, so it finds another hospital, CIMS. Because names in the Gujarati script are relatively sparse for POIs in India, instead of their desired result the user is shown one that is further away.
To address this challenge, we have built an ensemble of learned models to transliterate names of Latin script POIs into 10 languages prominent in India: Hindi, Bangla, Marathi, Telugu, Tamil, Gujarati, Kannada, Malayalam, Punjabi, and Odia. Using this ensemble, we have added names in these languages to millions of POIs in India, increasing the coverage nearly twenty-fold in some languages. This will immediately benefit millions of existing Indian users who don’t speak English, enabling them to find doctors, hospitals, grocery stores, banks, bus stops, train stations and other essential services in their own language.
Transliteration vs. Transcription vs. Translation
Our goal was to design a system that will transliterate from a reference Latin script name into the scripts and orthographies native to the above-mentioned languages. For example, the Devanagari script is the native script for both Hindi and Marathi (the language native to Nagpur, Maharashtra). Transliterating the Latin script names for NIT Garden and Chandramani Garden, both POIs in Nagpur, results in एनआईटी गार्डन and चंद्रमणी गार्डन, respectively, depending on the specific language’s orthography in that script.
It is important to note that the transliterated POI names are not translations. Transliteration is only concerned with writing the same words in a different script, much like an English language newspaper might choose to write the name Горбачёв from the Cyrillic script as “Gorbachev” for their readers who do not read the Cyrillic script. For example, the second word in both of the transliterated POI names above is still pronounced “garden”, and the second word of the Gujarati example earlier is still “hospital” — they remain the English words “garden” and “hospital”, just written in the other script. Indeed, common English words are frequently used in POI names in India, even when written in the native script. How the name is written in these scripts is largely driven by its pronunciation; so एनआईटी from the acronym NIT is pronounced “en-aye-tee”, not as the English word “nit”. Knowing that NIT is a common acronym from the region is one piece of evidence that can be used when deriving the correct transliteration.
Note also that, while we use the term transliteration, following convention in the NLP community for mapping directly between writing systems, romanization in South Asian languages regardless of the script is generally pronunciation-driven, and hence one could call these methods transcription rather than transliteration. The task remains, however, mapping between scripts, since pronunciation is only relatively coarsely captured in the Latin script for these languages, and there remain many script-specific correspondences that must be accounted for. This, coupled with the lack of standard spelling in the Latin script and the resulting variability, is what makes the task challenging.
We use an ensemble of models to automatically transliterate from the reference Latin script name (such as NIT Garden or Chandramani Garden) into the scripts and orthographies native to the above-mentioned languages. Candidate transliterations are derived from a pair of sequence-to-sequence (seq2seq) models. One is a finite-state model for general text transliteration, trained in a manner similar to models used by Gboard on-device for transliteration keyboards. The other is a neural long short-term memory (LSTM) model trained, in part, on the publicly released Dakshina dataset. This dataset contains Latin and native script data drawn from Wikipedia in 12 South Asian languages, including all but one of the languages mentioned above, and permits training and evaluation of various transliteration methods. Because the two models have such different characteristics, together they produce a greater variety of transliteration candidates.
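As a rough illustration of why pooling two models with different characteristics is useful, here is a minimal sketch in which made-up n-best lists of (candidate, score) pairs stand in for the outputs of the finite-state and LSTM models (the function name, scores, and candidates are illustrative, not the production system):

```python
# Illustrative sketch (not the production system): merge n-best candidate
# lists from two transliteration models, keeping the best score seen for
# each distinct native-script candidate string.

def merge_candidates(fst_nbest, lstm_nbest):
    """fst_nbest / lstm_nbest: lists of (candidate, log_score) pairs,
    e.g. from a finite-state model and an LSTM seq2seq model."""
    best = {}
    for cand, score in fst_nbest + lstm_nbest:
        best[cand] = max(score, best.get(cand, float("-inf")))
    # Higher (less negative) log score = better; return a ranked list.
    return sorted(best.items(), key=lambda kv: -kv[1])

# The two models often agree on easy names but diverge on hard ones,
# which is what makes pooling produce a greater variety of candidates.
pooled = merge_candidates(
    [("गार्डन", -1.2), ("गर्डन", -3.5)],   # finite-state n-best (made up)
    [("गार्डन", -0.8), ("गारडन", -2.9)],  # LSTM n-best (made up)
)
```

Because duplicates across the two lists are collapsed to their best score, the pooled list ranks shared candidates highly while still surfacing each model’s unique hypotheses for downstream scoring.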
To deal with the tricky phenomena of acronyms (such as the “NIT” and “KD” examples above), we developed a specialized transliteration module that generates additional candidate transliterations for these cases.
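One simple way to picture what such a module does is to spell an all-caps Latin acronym out letter by letter using native-script letter names, as in this hedged sketch (the function, the partial letter-name table, and the detection rule are our own illustration, not the actual module):

```python
# Hypothetical sketch of the acronym-handling idea: map each Latin letter
# of an acronym to its spoken name written in the target script. The
# Devanagari letter names below are a small, illustrative subset.

LETTER_NAMES_HI = {
    "K": "के",   # "kay"
    "D": "डी",   # "dee"
    "N": "एन",   # "en"
    "I": "आई",  # "aye"
    "T": "टी",   # "tee"
}

def acronym_candidate(token, letter_names):
    """Return a candidate transliteration for an all-caps acronym,
    or None if the token is not an acronym or a letter is missing."""
    if not (token.isupper() and token.isalpha()):
        return None
    parts = [letter_names.get(ch) for ch in token]
    if None in parts:
        return None
    return "".join(parts)

nit = acronym_candidate("NIT", LETTER_NAMES_HI)  # pronounced "en-aye-tee"
```

This reproduces the behavior described above for names like NIT Garden: the acronym is rendered by its pronunciation (“en-aye-tee”), not as the English word “nit”.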
For each native language script, the ensemble makes use of specialized romanization dictionaries of varying provenance that are tailored for place names, proper names, or common words. Examples of such romanization dictionaries are found in the Dakshina dataset.
Scoring in the Ensemble
The ensemble combines scores for the possible transliterations in a weighted mixture, the parameters of which are tuned specifically for POI name accuracy using small targeted development sets for such names.
For each native script token in candidate transliterations, the ensemble also weights the result according to its frequency in a very large sample of online text. Additional candidate scoring is based on a deterministic romanization approach derived from the ISO 15919 romanization standard, which maps each native script token to a unique Latin script string. This string allows the ensemble to track certain key correspondences when compared with the original Latin script token being transliterated, even though the ISO-derived mapping itself does not always perfectly correspond to how the given native script word is typically written in the Latin script.
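The combination of these signals can be pictured as a simple weighted mixture, sketched below. The feature functions, weights, and numbers are made up for illustration; the real ensemble tunes its parameters on targeted development sets of POI names:

```python
# Illustrative sketch of the weighted-mixture scoring idea (weights and
# feature values here are invented, not the tuned production parameters).
import math

def ensemble_score(model_score, token_frequency, iso_agreement, weights):
    """Combine evidence for one candidate transliteration.

    model_score:     log-probability from the candidate-generating models.
    token_frequency: count of the native-script token in a large text sample.
    iso_agreement:   closeness of the candidate's deterministic
                     ISO 15919-style romanization to the source Latin
                     token (1.0 = close match, lower = poor match).
    """
    w_model, w_freq, w_iso = weights
    freq_feature = math.log1p(token_frequency)  # damp very large counts
    return (w_model * model_score
            + w_freq * freq_feature
            + w_iso * iso_agreement)

weights = (1.0, 0.3, 2.0)
# A frequent, ISO-consistent candidate can outrank one that the
# generating models alone slightly preferred.
a = ensemble_score(-1.0, 50_000, 1.0, weights)
b = ensemble_score(-0.8, 10, 0.2, weights)
```

Here candidate `a` wins despite a lower model score, because corpus frequency and romanization agreement supply corroborating evidence, which is the intuition behind tuning the mixture for POI name accuracy.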
In aggregate, these many moving parts provide substantially higher quality transliterations than possible for any of the individual methods alone.
The following table provides the per-language quality and coverage improvements due to the ensemble over existing automatic transliterations of POI names. The coverage improvement measures the increase in items for which an automatic transliteration has been made available. The quality improvement measures the ratio of updated transliterations judged to be improvements to those judged inferior to the existing automatic transliterations.
[Table: per-language quality and coverage improvements. * Unknown / No Baseline.]
As with any machine learned system, the resulting automatic transliterations may contain a few errors or infelicities, but the large increase in coverage in these widely spoken languages marks a substantial expansion of the accessibility of information within Google Maps in India. Future work will include using the ensemble for transliteration of other classes of entities within Maps and its extension to other languages and scripts, including Perso-Arabic scripts, which are also commonly used in the region.
This work was a collaboration between the authors and Jacob Farner, Jonathan Herbert, Anna Katanova, Andre Lebedev, Chris Miles, Brian Roark, Anurag Sharma, Kevin Wang, Andy Wildenberg, and many others.