A guest article by Ido Gus of CEVA
CEVA is a leading licensor of wireless connectivity and smart sensing technologies. Our products help OEMs design power-efficient, intelligent and connected devices for a range of end markets, including mobile, consumer, automotive, robotics, industrial and IoT.
In this article, we’ll describe how we used TensorFlow Lite for Microcontrollers (TFLM) to deploy a speech recognition engine and frontend, called WhisPro, on a bare-metal development board based on our CEVA-BX DSP core. WhisPro detects always-on wake words and speech commands efficiently, on-device.
Figure 1: CEVA Multi-microphone DSP Development Board
WhisPro is a speech recognition engine and frontend targeted to run on low power, resource constrained edge devices. It is designed to handle the entire data flow from processing audio samples to detection.
WhisPro supports two use cases for edge devices:
- Always-on wake word detection engine. In this use case, WhisPro’s role is to wake a device in sleep mode when a predefined phrase is detected.
- Speech commands. In this use case, WhisPro’s role is to enable a voice-based interface. Users can control the device using their voice. Typical commands can be: volume up, volume down, play, stop, etc.
WhisPro enables a voice interface on any SoC that has a CEVA-BX DSP core integrated into it, lowering entry barriers for OEMs and ODMs interested in joining the voice interface revolution.
Originally, WhisPro was implemented using an in-house neural network library called CEVA NN Lib. Although that implementation achieved excellent performance, the development process was quite involved. We realized that, if we ported the TFLM runtime library and optimized it for our target hardware, the entire model porting process would become transparent and more reliable (far fewer lines of code would need to be written, modified, and maintained).
Building TFLM for CEVA-BX DSP Family
The first thing we had to do was figure out how to port TFLM to our platform. We found the guide to porting TFLM to a new platform quite useful.
Following the guide, we:
- Verified DebugLog() implementation is supported by our platform.
- Created a TFLM runtime library project in CEVA’s Eclipse-based IDE:
  - Created a new CEVA-BX project in CEVA’s IDE
  - Added all the required source files to the project
- Built the TFLM runtime library for the CEVA-BX core.
This required the usual fiddling with compiler flags, include paths (not all required files are under the “micro” directory), the linker script, and so on.
Model Porting Process
Our starting point is a Keras implementation of our model. Let’s look at the steps we took to deploy our model on our bare-metal target hardware:
Converted the TensorFlow model to TensorFlow Lite using the built-in TensorFlow converter:
converter = tf.lite.TFLiteConverter.from_keras_model(keras_model)
converter.experimental_new_converter = True
converter.optimizations = [tf.lite.Optimize.OPTIMIZE_FOR_SIZE]
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.representative_dataset = representative_data_gen
tflite_model = converter.convert()
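For full-integer quantization, the converter needs a representative dataset to calibrate activation ranges. A minimal sketch of what such a generator might look like (the sample count and feature shape below are illustrative, not taken from the actual WhisPro model):

```python
import numpy as np

# Hypothetical calibration set: 100 audio feature frames.
# The shape (49 frames x 40 mel bins) is illustrative only.
calibration_samples = np.random.rand(100, 49, 40).astype(np.float32)

def representative_data_gen():
    # The converter iterates over this generator to observe activation
    # ranges and pick int8 quantization parameters.
    for sample in calibration_samples:
        yield [np.expand_dims(sample, axis=0)]

batches = list(representative_data_gen())
```

In practice you would yield a few hundred samples drawn from real training or validation data so the observed ranges match production inputs.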
Converted the TensorFlow Lite model to TFLM using xxd:
$ xxd -i model.tflite > model.cc
Here we found that some of the model layers (for example, GRU) were not properly supported (at the time) by TFLM. It is very reasonable to assume that, as TFLM continues to mature and Google and the TFLM community invest more in it, issues like this will become rarer.
In our case, though, we opted to re-implement the GRU layers in terms of Fully Connected layers, which was surprisingly easy.
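This rewrite is tractable because a GRU cell is really just a handful of fully connected (matmul-plus-bias) operations glued together with elementwise nonlinearities. A rough NumPy sketch of one GRU step (weight names are illustrative; this is not the actual WhisPro code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Uz, bz, Wr, Ur, br, Wh, Uh, bh):
    # Each gate is a fully connected layer over the input x and the
    # previous hidden state h.
    z = sigmoid(x @ Wz + h @ Uz + bz)               # update gate
    r = sigmoid(x @ Wr + h @ Ur + br)               # reset gate
    h_cand = np.tanh(x @ Wh + (r * h) @ Uh + bh)    # candidate state
    return (1 - z) * h + z * h_cand                 # blended new state
```

Expressed this way, each matmul maps directly onto TFLM's fully connected operator, with the gating arithmetic handled by elementwise ops.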
The next step was to integrate the TFLM runtime library and the converted model into our existing embedded C frontend, which handles audio preprocessing and feature extraction.
Even though our frontend was not written with TFLM in mind, it was modular enough to allow easy integration by implementing a couple of simple wrapper functions, as follows:
- Linked the TFLM runtime library into our embedded C application (WhisPro frontend)
- Implemented a wrapper-over-setup function for mapping the model into a usable data structure, allocating the interpreter and tensors
- Implemented a wrapper-over-execute function for mapping data passed from the WhisPro frontend into tflite tensors used by the actual execute function
- Replaced the call to the original model execute function with a call to the TFLM implementation
The process we described is performed by two components:
- The microcontroller supplier (in this case, CEVA) is responsible for optimizing TFLM for its hardware architecture.
- The microcontroller user (in this case, the CEVA WhisPro developer) is responsible for deploying a neural-network-based model on the target microcontroller, using the optimized TFLM runtime library.
This work has proven the importance of the TFLM platform to us, and the significant value supporting TFLM can add to our customers and partners by enabling easy neural network model deployment on edge devices. We are committed to further support TFLM on the CEVA-BX DSP family by:
- Actively contributing to the TFLM project, with the goal of improving layer coverage and overall platform maturity.
- Investing in TFLM operator optimization for execution on CEVA-BX cores, aiming for full coverage.
While the porting process had some bumps along the way, in the end it was a great success, taking about 4-5 days of work. Implementing the model in C from scratch, and handcrafting model conversion scripts from Python to C, could have taken 2-3 weeks (and lots of debugging).
CEVA Technology Virtual Seminar
To learn more, you are welcome to watch CEVA’s virtual seminar – Wireless Audio session, covering TFLM, amongst other topics.
Machine learning models are now being used to accomplish many challenging tasks. With their vast potential, ML models also raise questions about their usage, construction, and limitations. Documenting the answers to these questions helps to bring clarity and shared understanding. To help advance these goals, Google has introduced model cards.
Model cards aim to provide a concise, holistic picture of a machine learning model. To start, a model card explains what a model does, its intended audience, and who maintains it. A model card also provides insight into the construction of the model, including its architecture and the training data used. Not only does a model card include raw performance metrics; it also puts a model’s limitations and risk mitigation opportunities into context. The Model Cards for Model Reporting research paper provides detailed coverage of model cards.
In this blog post, I hope to show how easy it is for you to create your own model card. We will use the popular scikit-learn framework, but the concepts you learn here will apply whether you’re using TensorFlow, PyTorch, XGBoost, or any other framework.
Model Card Toolkit
The Model Card Toolkit streamlines the process of creating a model card. The toolkit provides functions to populate and export a model card. The toolkit can also import model card metadata directly from TensorFlow Extended or ML Metadata, but that capability is not required. We will manually populate the model card fields in this blog post, and then export the model card to HTML for viewing.
Dataset and Model
We’ll be using the Breast Cancer Wisconsin Diagnostic Dataset. This dataset contains 569 instances with numeric measurements from digitized images. Let’s peek at a sample of the data:
We’ll use a GradientBoostingClassifier from scikit-learn to build the model. The model is a binary classifier, which means that it predicts whether an instance is of one type or another. In this case, we’re predicting whether a mass is benign or malignant, based on the provided measurements.
For example, you can see from the two plots below that the “mean radius” and “mean texture” features are correlated with the diagnosis (0 is malignant, 1 is benign). The model will be trained to optimize for the features, relationships between features, and weights of the features that predict best. For the purposes of this article, we won’t go into more depth on the model architecture.
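Still, a minimal training sketch helps ground the discussion. The snippet below uses scikit-learn's built-in copy of the dataset; the train/test split and hyperparameters are illustrative, not necessarily those used in the original notebook:

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

# Load the Breast Cancer Wisconsin Diagnostic dataset (569 instances).
X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42)

# A gradient-boosted binary classifier: predicts malignant (0) vs. benign (1).
model = GradientBoostingClassifier(random_state=42)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)
```

The trained model and its held-out accuracy are exactly the kinds of artifacts a model card should document alongside the dataset description.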
Creating a Notebook
AI Platform Notebooks enable data scientists to prototype, develop, and deploy models in the cloud. Let’s start by creating a notebook in the Google Cloud console. You can create a new instance that already has scikit-learn, pandas, and other popular frameworks pre-installed with the “Python 2 and 3” instance. Once your notebook server is provisioned, select OPEN JUPYTERLAB to begin.
Since the dataset we’ll use only contains 569 rows, we can quickly train our model within the notebook instance. If you’re building a model based on a larger dataset, you can also leverage the AI Platform Training service to build your scikit-learn model, without managing any infrastructure. Also, when you’re ready to host your model, the AI Platform Prediction service can serve your scikit-learn model, providing a REST endpoint and auto-scaling if needed.
Loading the Sample Notebook
The Model Card Toolkit Github repository contains samples along with the project source code. Let’s start by cloning the repository by selecting Git > Clone a Repository in the JupyterLab menu.
Then, enter the repository URL (https://github.com/tensorflow/model-card-toolkit), and the contents will be downloaded into your notebook environment. Navigate through the directory structure:
model-card-toolkit/model_card_toolkit/documentation/examples, and open the Scikit-Learn notebook.
Creating a Model Card
Let’s get started! In this section, we’ll highlight key steps to create a model card. You can also follow along in the sample notebook, but that’s not required.
The first step is to install the Model Card Toolkit. Simply use the Python package manager to install the package in your environment:
pip install model-card-toolkit
To begin creating a model card, you’ll need to initialize the model card, and then generate the model card toolkit assets. The scaffolding process creates an asset directory, along with a model card JSON file and a customizable model card UI template. If you happen to use ML Metadata Store, you can optionally initialize the toolkit with your metadata store, to automatically populate model card properties and plots. In this article, we will demonstrate how to manually populate that information.
Populating the Model Card
From this point, you can add a number of properties to the model card. The properties support nesting and a number of different data types, as you can see below, such as arrays of multiple values.
The model card schema is available for your reference. It defines the structure and accepted data types for your model card. For example, here’s a snippet that describes the name property we showed above.
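To give a feel for the shape of that schema, here is a hand-built fragment as a plain Python dictionary. The top-level field names follow the general structure of the model card schema, but the values and some nesting details are illustrative:

```python
import json

# Illustrative model card fragment; values are made up for this example.
model_card_fragment = {
    "model_details": {
        "name": "Breast Cancer Wisconsin binary classifier",
        "owners": [{"name": "Demo team", "contact": "demo@example.com"}],
        "version": {"name": "0.1"},
    },
    "considerations": {
        "limitations": [
            {"description": "Trained on only 569 instances; not for clinical use."}
        ],
    },
}

# The toolkit validates populated fields against its JSON schema before
# export; serializing to JSON mirrors that round trip.
serialized = json.dumps(model_card_fragment, indent=2)
```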
Images need to be provided as base-64 encoded strings. The sample notebook provides some code that exports plots to PNG format, then encodes them as base-64 strings.
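The encoding step itself is just the standard library's base64 module. In this sketch, a short placeholder byte string stands in for the raw bytes of a real exported PNG:

```python
import base64

# Placeholder bytes standing in for an exported PNG plot; in the notebook
# these would be the raw bytes of a matplotlib figure saved as PNG.
png_bytes = b"\x89PNG\r\n\x1a\n" + b"\x00" * 16

# Model cards carry images as base-64 encoded strings.
encoded_image = base64.b64encode(png_bytes).decode("utf-8")
```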
The final step is writing the model card contents back to the scaffolded JSON file. This process will first validate the properties you populated in the model card.
Generating a Model Card
We’re now ready to generate the model card. In this next code snippet, we’ll simply export the model card to HTML and display it within the notebook.
The HTML file is generated in the output directory you specified when initializing the toolkit. By default, the assets are created in a temp directory. You can also optionally pass in a custom UI template for your model card; if you choose to do that, the default template is a great starting point.
Let’s take a look at the results!
In this post, we’ve shown how to create your own model card using scikit-learn. In fact, you can apply what you’ve learned here to any machine learning framework, and if you use TensorFlow Extended (TFX), you can even populate the model card automatically.
Using the Model Card Toolkit, it’s as straightforward as populating model properties and exporting the result into an HTML template of your choice. You can use the sample notebook to see how it’s done.
We’ve also discussed how you can use the Google Cloud AI Platform to manage the full lifecycle of a scikit-learn model, from developing the model, to training it, and then serving it.
We hope that you’re able to use the platform to improve understanding of your own models in the future!
When developing a service to deploy on Kubernetes, do you sometimes feel like you’re more focused on your YAML files than on your application? When working with YAML, do you find it hard to detect errors early in the development process? We created Cloud Code to let you spend more time writing code and less time configuring your application; it includes authoring support features such as inline documentation, completions, and schema validation, a.k.a. “linting.”
But over the years, working with Kubernetes YAML has become increasingly complex. As Kubernetes has grown more popular, many developers have extended the Kubernetes API with new Operators and Custom Resource Definitions (CRDs). These new Operators and CRDs expanded the Kubernetes ecosystem with new functionality such as continuous integration and delivery, machine learning, and network security. Today, we’re excited to share authoring support for a broad set of Kubernetes CRDs, including:
- Over 400 popular Kubernetes CRDs out of the box—up from just a handful
- Any existing CRDs in your Kubernetes cluster
- Any CRDs you add from your local machine or a URL
Cloud Code is a set of plugins for the VS Code and JetBrains Integrated Development Environments (IDEs), and provides everything you need to write, debug, and deploy your cloud-native applications. Now, its authoring support makes it easier to write, understand, and see errors in the YAML for a wide range of Kubernetes CRDs.
Cloud Code’s enhanced authoring support lets you leverage this custom Kubernetes functionality by creating a resource file that conforms to the CRD. For example, you might want to distribute your TensorFlow jobs across multiple pods in a cluster. You can do this by authoring a TFJob resource based on the TFJob CRD and applying it to the cluster where the KubeFlow operator can act on it.
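As a sketch of what such a resource looks like, here is a minimal TFJob manifest. The field names follow the Kubeflow TFJob CRD; the metadata, replica count, and image are illustrative:

```yaml
# Hypothetical minimal TFJob resource; values are illustrative.
apiVersion: kubeflow.org/v1
kind: TFJob
metadata:
  name: mnist-train
spec:
  tfReplicaSpecs:
    Worker:
      replicas: 2
      template:
        spec:
          containers:
          - name: tensorflow
            image: gcr.io/my-project/mnist:latest
```

With CRD authoring support, Cloud Code can offer completions and flag schema errors (say, a misspelled `tfReplicaSpecs`) as you type a file like this.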
Expanding built-in support
Cloud Code has expanded authoring support to over 400 of the most popular Kubernetes CRDs, including those used by Google Cloud and Anthos.
Works with your cluster’s CRDs
While Cloud Code now supports a breadth of popular public, Google Cloud, and Anthos CRDs, you may have your own private CRDs installed on a cluster. When you set a cluster running Kubernetes v1.16 or above as the active context in Cloud Code’s Kubernetes Explorer, Cloud Code automatically provides authoring support from the schema of all CRDs installed on the cluster.
Add your own CRDs
Despite the breadth of existing CRDs, you may find that there isn’t one that meets your needs. The solution here is to define your own CRD. For example, if you’re running your in-house CI system on Kubernetes, you could define your CRD schemas and allow developers to easily point Cloud Code to copies of those CRD schema files, to get authoring assistance for the resources in their IDEs.
To add a CRD to Cloud Code, just point Cloud Code to a local path or remote URL for a file defining the custom resource. The remote URL can be as simple as a direct link to a file in GitHub. If you want to learn more about custom resource definitions or create your own, take a look at this documentation page. Once configured, you get the same great inline documentation, completions, and linting from Cloud Code when editing that CRD’s YAML files—and it’s super easy to set up in both VS Code and JetBrains IDEs.
Get started today
To see how Cloud Code can help you simplify your Kubernetes development, we invite you to try out the expanded Kubernetes CRD authoring support. To get started, simply install Cloud Code from the VS Code or JetBrains extension marketplaces, open a CRD’s YAML file, and start editing.
Once you have Cloud Code installed, you can also try Cloud Code’s fast, iterative development and debugging capabilities for your Kubernetes projects. Beyond Kubernetes, Cloud Code can also help you add Google Cloud APIs to your project or start developing a Cloud Run service with the Cloud Run Emulator.
At Google Cloud, we’re invested in building data analytics products with a customer-first mindset. Our engineering team is thrilled to share recent feature enhancements and product updates that we’ve made to help you get even more value out of BigQuery, Google Cloud’s enterprise data warehouse.
To support you in writing more efficient queries, BigQuery released a whole new set of SQL features. You can now easily add columns to your tables or delete the contents using new table operations, efficiently read from and write to external storage with new commands, and leverage new DATE and STRING functions. Learn more about these features in Smile everyday with new user-friendly SQL capabilities in BigQuery.
Read on to learn about other exciting recent additions to BigQuery and how they can help you speed up queries, efficiently organize and manage your data, and lower your costs.
Create partitions using flexible time units for fast and efficient queries
A core part of any data strategy is how you optimize your data warehouse for speed while reducing the amount of time spent looking at data you don’t need. Defining and implementing a clear table partitioning and clustering strategy is a great place to start.
We’re excited to announce that now you have even more granular control over your partitions with time unit partitioning available in BigQuery. Using flexible units of time (ranging from an hour to a year), you can organize time-based data to optimize how your users load and query data. BigQuery time-based partitioning now also supports the DATETIME data type, in addition to DATE and TIMESTAMP. Now you can easily aggregate global timestamp data without the need to convert data or add additional TIMESTAMP columns. With these updates, BigQuery now supports different time units on the same DATETIME data type, giving you the flexibility to write extremely fast and efficient queries.
Time unit partitioning is easily implemented using standard SQL DDL. For example, you can create a table named newtable that is hourly partitioned by the transaction_ts TIMESTAMP column using TIMESTAMP_TRUNC to delineate the timestamp at the hour mark:
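A sketch of such a DDL statement (the dataset, table, and column names are illustrative):

```sql
CREATE TABLE mydataset.newtable (
  transaction_id INT64,
  transaction_ts TIMESTAMP
)
PARTITION BY TIMESTAMP_TRUNC(transaction_ts, HOUR);
```

Swapping `HOUR` for `DAY`, `MONTH`, or `YEAR` changes the partition granularity without any other changes to the table definition.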
As with other partitioning schemes in BigQuery, you can use clustering along with these new partitioning schemes to speed up the performance of your queries. Best part—there is no additional cost for the use of these new partitioning schemes; it’s included with the base BigQuery pricing. These new partitioning schemes can help you lower query costs and allow you to match partitioning schemes available in traditional data warehouses for ease of migration.
Check out the demo video to see time unit partitioning in action, and read more in the BigQuery documentation.
Take advantage of expanded access to metadata via INFORMATION_SCHEMA
When our team was deciding where and how to expose rich metadata about BigQuery datasets, tables, views, routines (stored procedures and user-defined functions), schemas, jobs, and slots, the natural answer was BigQuery itself. You can use the INFORMATION_SCHEMA views to access metadata on datasets, tables, views, jobs, reservations, and even streaming data!
Here are some quick code snippets of how people are asking questions of this metadata:
What are all the tables in my dataset?
How was this view defined again…?
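Queries along these lines answer both questions (the dataset and view names are illustrative; see BigQuery's INFORMATION_SCHEMA reference for the full column lists):

```sql
-- What are all the tables in my dataset?
SELECT table_name
FROM mydataset.INFORMATION_SCHEMA.TABLES;

-- How was this view defined again...?
SELECT view_definition
FROM mydataset.INFORMATION_SCHEMA.VIEWS
WHERE table_name = 'my_view';
```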
You can also use the INFORMATION_SCHEMA.JOBS_TIMELINE_BY_* views to retrieve real-time BigQuery metadata by timeslice for the previous 180 days of currently running and/or completed jobs. The INFORMATION_SCHEMA jobs timeline views are regionalized, so be sure to use a region qualifier in your queries, as shown in the examples below.
How many jobs are running at any given time?
Which queries used the most slot resources in the last day?
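Sketches of both queries against the project-level jobs timeline view (the region qualifier and time windows are illustrative; column names follow the documented view schema):

```sql
-- How many jobs are running at any given time (hourly, last day)?
SELECT
  TIMESTAMP_TRUNC(period_start, HOUR) AS hour,
  COUNT(DISTINCT job_id) AS running_jobs
FROM `region-us`.INFORMATION_SCHEMA.JOBS_TIMELINE_BY_PROJECT
WHERE period_start > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
  AND state = 'RUNNING'
GROUP BY hour
ORDER BY hour;

-- Which queries used the most slot resources in the last day?
SELECT
  job_id,
  SUM(period_slot_ms) AS total_slot_ms
FROM `region-us`.INFORMATION_SCHEMA.JOBS_TIMELINE_BY_PROJECT
WHERE job_creation_time > TIMESTAMP_SUB(CURRENT_TIMESTAMP(), INTERVAL 1 DAY)
GROUP BY job_id
ORDER BY total_slot_ms DESC
LIMIT 10;
```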
Of course, running the above query every day and monitoring the results can be tedious, which is why the BigQuery team created new publicly available report templates (more on that shortly).
Streamline the management of your BigQuery slots and jobs
If you’re using BigQuery reservations, monitoring the slot usage from each of your projects and teams can be challenging. We’re excited to announce BigQuery System Tables Reports, a solution that aims to help you monitor BigQuery flat-rate slot and reservation utilization by leveraging BigQuery’s underlying INFORMATION_SCHEMA views. These reports provide easy ways to monitor your slots and reservations by hour or day and review job execution and errors.
Check out the new Data Studio dashboard template to see these reports in action. Here’s a look at one option:
In addition to streamlining the management of BigQuery slots, we’re also working on making it easier for you to manage your jobs. For example, you can now use SQL to easily cancel your jobs with one simple statement:
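The statement in question is the BQ.JOBS.CANCEL system procedure (the job ID below is hypothetical; the fully qualified form is project.location.job_id):

```sql
CALL BQ.JOBS.CANCEL('myproject.us.bquxjob_1234abcd');
```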
The procedure returns immediately, and BigQuery cancels the job shortly afterwards.
Review all of the ways that you can manage your BigQuery jobs in the documentation.
Leverage advances in data governance to manage access to individual tables (and columns soon!)
Building on the introduction of data class-based access controls earlier this year, we have now launched Table ACLs into GA and added integration into Data Catalog. These new features provide you with individualized control over your tables and allow you to find and share data more easily via a data dictionary in the Data Catalog.
With Table ACLs, you no longer need access to the entire dataset to query a specific table. You can now set an Identity and Access Management (IAM) policy on an individual table or view in one of three easy ways:
- The bq set-iam-policy command in the bq command-line tool
- The tables.setIamPolicy API method
- The Google Cloud Console
For example, using the role BigQuery Data Viewer (roles/bigquery.dataViewer), you can grant read access on an individual table without the user needing access to the dataset the table belongs to. In addition, you can let users see which tables they have access to in a dataset by granting the role BigQuery Metadata Viewer (roles/bigquery.metadataViewer) or the bigquery.tables.list permission on that dataset.
And coming soon to GA is column-level security. With this feature (currently in beta), you will be able to restrict data access at the column level with just three steps:
- Define a taxonomy of policy tags in Data Catalog.
- Use schema annotations to assign a policy tag to each column for which you want to control access.
- Use Identity and Access Management (IAM) policies to restrict access to each policy tag. The policy will be in effect for each column belonging to the policy tag.
Both column-level ACLs and Table ACLs are exposed in Data Catalog searches. Using policy-tag-based search, you will be able to find specific data protected with column-level ACLs. Data Catalog will also index all tables that you have access to (again, even if you don’t have access to the surrounding dataset).
In case you missed it:
The BigQuery Simba ODBC driver now leverages the optimized, synchronous API, Jobs.Query, for the majority of BI and analytical queries. In addition, the BigQuery Simba ODBC and JDBC drivers both now auto-enable the high-throughput read API for all queries on anonymous tables (check out the necessary criteria). To enable these improvements, install the latest Simba driver versions here.
Cloud Next OnAir ‘20 sessions included some great sessions on data analytics. Check them out to learn more about Best Practices from Experts to Maximize BigQuery Performance, Analytics in a Multi-Cloud World with BigQuery Omni, and Awesome New Features to Help You Manage BigQuery.
- Now in preview, the Cloud Console UI lets you opt in to search and autocomplete powered by Data Catalog. With this feature, you can search for all of your resources, even those outside your pinned projects. If you have a large number of resources, the overall performance of the Cloud Console is also improved with this option. Preview this feature by enabling it when prompted in the Cloud Console UI.
During my 10 years at The Boston Globe, we took a different path than most publishers. In 2011, we built a state-of-the-art website that was supported almost entirely by digital subscriptions, at a time when it was uncertain if readers would ever pay for news online. Today, digital subscriptions revenue alone more than covers the cost of The Globe’s newsroom. Motivated to help others in the industry, I’ve since joined FTI Consulting, where I advise local publishers as they navigate the same existential business questions as we did.
Since I spent the last decade of my life pioneering a new business model for journalism, people often ask me if digital subscriptions can be a viable strategy for local news. Experience has taught me the simple answer is yes.
That’s why FTI Consulting partnered with the Google News Initiative on the GNI Digital Growth Program, a free program to help more small and mid-sized news publishers around the world achieve digital success. Reader revenue is central to the program’s curriculum, which is supported through playbooks, interactive exercises, workshops and labs. The workshops are currently guiding publishers through reader revenue models and sharing lessons learned from news organizations around the world, including those which have participated in the GNI Subscriptions Labs in North America, Latin America, Europe and Asia Pacific.
Publishers in the North America program, which FTI Consulting partnered on with Google and the Local Media Association, are great examples of the growth potential for local news. While the year-long Lab, focused on helping news companies dive deep on digital subscriptions, wrapped up about six months ago, these publishers share a continued commitment to sustainability led by reader revenue.
As of August, the median year-over-year gain in digital subscriptions revenue for the participating publishers was 86 percent, compared to the industry average of 45 percent. While the business model of each organization is unique, these publishers achieved a higher level of performance by rallying around a shared set of digital metrics proven to make business impact.
For starters, they nearly doubled the conversion rates of their online readers to paying subscribers since the start of the Lab. They achieved this dramatic increase through a variety of tested tactics, including making digital subscriptions a priority, asking readers more often to subscribe and sign up for newsletters, improving and simplifying the online subscription checkout process and increasing website page speed.
What may be most impressive, though, is that these publishers were able to grow their overall number of subscribers without deep discounts or aggressive promotional offers. In fact, they raised the prices of their digital subscription offerings, even during today’s pandemic. The group’s average revenue per user (ARPU) has increased by 24 percent.
More important than the tactical improvements, publishers involved with the Lab have been able to create the “reader revenue machine,” a term that I use to describe a publisher that has put in place the mindset, processes, capabilities and technology to grow reader revenue continuously.
A good example of this transformation is The Portland Press Herald. In March, they launched “Digital Only Mondays,” which means they no longer print physical newspapers on Mondays. Within the first few weeks, this experiment increased the digital engagement of their print subscribers by 26 percent, and significantly reduced costs by eliminating one day of printing. The result as of July: The Press Herald was up 114 percent in digital subscriptions revenue compared to last year, and their staff gained the confidence to make bold decisions to support their digital transformation.
The reader revenue growth of The Press Herald is just one example of the bright spots I’ve seen shaping the future of local news. Through the GNI Digital Growth Program, I’m looking forward to working with Google to scale these insights and real world examples to help more publishers build sustainable business models for local journalism.
For those interested in learning more about the best practices that have helped publishers achieve digital subscriber success, join me, Google, other industry leaders and nearly 2,000 news organizations globally for our Reader Revenue workshops. Coming up next week, I will co-host a panel on this very topic. To sign up, visit the workshop registration page.
This year, searches for “how to vote” in the U.S. are higher than ever before. To make it easier to find information about how and where to vote—regardless of your preferred voting method—we’ve launched election-related features with information from trusted and authoritative organizations in Google Search.
Starting today, when you go to Google Search and Maps for information on where to vote, you’ll find helpful features that show the voting locations closest to you. On Google, search for things like “early voting locations” or “ballot drop boxes near me” and you’ll find details on where you can vote in person or return your mail-in ballot, whether you’re voting early or on Election Day. You’ll also see helpful reminders, such as bringing your ballot completed and sealed.
Just as easily, you’ll soon be able to ask, “Hey Google, where do I vote?” and Google Assistant will share details on where to vote nearby on your Assistant-enabled phone, smart speaker or Smart Display.
After you’ve selected a voting location or ballot drop box in your area, you can click through from Google Search or Google Assistant to Google Maps for quick information about how far it is, how to get there, and voting hours. Similarly, if you search for your voting location in Google Maps you’ll have easy access to the feature in Search to help you confirm where you can cast your vote.
The official information in this feature comes from the Voting Information Project, a partnership between Democracy Works, a nonprofit, nonpartisan civic organization, and state election officials. Through the Voting Information Project, we plan to have more than 200,000 voting locations available across the country. For places where voting locations are not yet available, we’ll surface state and local election websites. As more locations become available, or if there are changes, we’ll continue to update the information provided across Google Search, Assistant and Maps.
If you’ve ever been in a rush to grab a quick bite, you may know the pain that comes along with finding out that the restaurant you chose is packed and there’s nowhere to sit. Or maybe you need to pick up just one item from the grocery store, only to find that the line is out the door—derailing your plans and causing you unnecessary stress.
These problems were top of mind when Google Maps launched popular times and live busyness information—helpful features that let you see how busy a place tends to be on a given day and time or in a specific moment. This information has become a powerful tool during the pandemic, making it easier to social distance because you know in advance how crowded a place will be. Today, we’ll take a closer look at how we calculate busyness information, while keeping your data private and secure.
Popular times: making sense of historical busyness information
To calculate busyness insights, we analyze aggregated and anonymized Location History data from people who have opted to turn this setting on from their Google Account. This data is instrumental in calculating how busy a place typically is for every hour of the week. The busiest hour becomes our benchmark—and we then display busyness data for the rest of the week relative to that hour.
For example, say there’s a new ice cream shop down the block known for its homemade waffle cones 🍦. With Location History insights, our systems know that the shop is consistently most crowded on Saturday afternoons at 4 p.m. As a result, popular times information for the rest of the week will be displayed as “Usually as busy as it gets” when it’s approximately as busy as Saturday at 4 p.m., “Usually not too busy” when it is much less busy, and “Usually a little busy” for somewhere in between. This data can also show how long people tend to spend at the ice cream shop, which is handy if you’re planning a day with multiple activities and want to know how much time to allocate at each place.
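The benchmark idea above can be sketched in a few lines of Python. Everything here is illustrative: the threshold values, the function name, and the shape of the aggregated data are assumptions for the sake of the example, not Google’s actual parameters.

```python
def busyness_labels(hourly_visits):
    """Label each hour's busyness relative to the busiest hour of the week.

    hourly_visits: dict mapping an hour key (e.g. "Sat 16:00") to an
    aggregated, anonymized average visit count. The 0.9 / 0.5 cutoffs
    are illustrative assumptions.
    """
    benchmark = max(hourly_visits.values())  # the busiest hour is the benchmark
    labels = {}
    for hour, visits in hourly_visits.items():
        ratio = visits / benchmark
        if ratio >= 0.9:
            labels[hour] = "Usually as busy as it gets"
        elif ratio >= 0.5:
            labels[hour] = "Usually a little busy"
        else:
            labels[hour] = "Usually not too busy"
    return labels
```

With the ice-cream-shop example, Saturday at 4 p.m. sets the benchmark, and every other hour is labeled by how it compares to it.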
Making adjustments in times of COVID
Google Maps’ popular times algorithms have long been able to identify busyness patterns for a place. With social distancing measures established and businesses adjusting hours or even closing temporarily due to COVID-19, our historical data was no longer as reliable in predicting what current conditions would be. To make our systems more nimble, we began favoring more recent data from the previous four to six weeks to quickly adapt to changing patterns for popular times and live busyness information, with plans to bring a similar approach to other features like wait times soon.
Real-time busyness information: how busy a place is right now
Busyness patterns identified by popular times are useful—but what about when there are outliers? Shelter in place orders made local grocery stores much busier than usual as people stocked up on supplies. Warm weather can cause crowds of people to flock to a nearby park. And a new promotion or discount can drive more customers to nearby stores and restaurants.
Take the ice cream shop again. Say that, knowing that business is slow on Tuesdays, the shop owners decide to host a three-scoop sundae giveaway on a Tuesday to promote their newest flavor—because everyone loves free ice cream! The promotion brings in more than double the number of customers they typically see on that day and time. Gleaning insights from Location History data in real time, our systems are able to detect this spike in busyness and display it as “Live” data in Google Maps so you can see how busy the shop is right now—even if it varies drastically from its typical busyness levels.
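A toy version of that live-versus-typical comparison might look like the sketch below: compare the current estimated visit count against the historical average for the same day and hour, and surface “Live” data only when the deviation is large. The deviation threshold and function names are assumptions for illustration.

```python
def live_busyness(current_count, typical_count, threshold=0.25):
    """Decide whether live data should override the historical estimate.

    current_count: visits estimated from real-time aggregated signals.
    typical_count: the historical average for this day and hour.
    threshold: illustrative relative-deviation cutoff (an assumption).
    Returns (show_live, description).
    """
    if typical_count == 0:
        # No historical baseline to compare against; any activity is notable.
        return True, "Busier than usual"
    deviation = (current_count - typical_count) / typical_count
    if deviation > threshold:
        return True, "Busier than usual"
    if deviation < -threshold:
        return True, "Less busy than usual"
    return False, "As busy as usual"
```

In the sundae-giveaway example, more than double the typical Tuesday traffic comfortably clears the threshold, so the live spike would be shown instead of the usual quiet-Tuesday estimate.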
Making sure your data is private, safe and secure
Privacy is a top priority when calculating busyness, and it’s woven into every step of the process. We use an advanced statistical technique known as differential privacy to ensure that busyness data remains anonymous. Differential privacy uses a number of methods, including artificially adding “noise” to our Location History dataset to generate busyness insights without ever identifying any individual person. And if our systems don’t have enough data to provide an accurate, anonymous busyness recommendation, we don’t publish it—which is why there are times when you may not see busyness information for a place at all.
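Conceptually, the noise-plus-suppression step described above can be sketched as follows. This adds Laplace noise (a standard differential-privacy mechanism for counting queries) to an aggregate visit count and withholds counts that are too small to publish safely. The epsilon value, the minimum-count cutoff, and the function name are illustrative assumptions, not Google’s actual parameters.

```python
import random

def noisy_count(true_count, epsilon=0.5, min_publishable=30):
    """Add Laplace noise to an aggregate count before publishing it.

    A counting query has sensitivity 1 (one person changes the count by
    at most 1), so Laplace noise with scale 1/epsilon provides
    epsilon-differential privacy for that count.
    Returns None when the noisy count is too small to publish.
    """
    scale = 1.0 / epsilon
    # A Laplace(0, scale) draw is the difference of two exponential draws.
    noise = random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)
    noisy = true_count + noise
    if noisy < min_publishable:
        return None  # not enough data to publish an anonymous estimate
    return round(noisy)
```

The suppression branch mirrors the behavior described in the post: when there isn’t enough data for an accurate, anonymous estimate, nothing is published at all.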
Google Maps is always thinking about ways to solve the problems you face throughout your day, whether they’re big (like getting around safely) or small (like quickly snagging your favorite scoop of ice cream). Check out the Maps 101 series for other under-the-hood looks at your favorite features, with more deep dives coming soon.
Google Maps helps you navigate, explore, and get things done every single day. In this series, we’ll take a look under the hood at how Google Maps uses technology to build helpful products—from using flocks of sheep and laser beams to gather high-definition imagery to predicting traffic jams that haven’t even happened yet.
People turn to Google Maps for accurate, fresh information about what’s going on in the world—especially so during the pandemic. Activities like picking up something from the store, going for a walk, or grabbing a bite to eat now require a significant amount of planning and preparation. At any given time, you may be thinking: “Does the place I’m headed to have enough room for social distancing?” or “What safety precautions are being taken at my destination?”
Today, as part of our Search On event, we’re announcing new improvements to arm you with the information you need to navigate your world safely and get things done.
Make informed decisions with new live busyness updates
The ability to see busyness information on Google Maps has been one of our most popular features since it launched back in 2016. During the pandemic, this information has transformed into an essential tool, helping people quickly understand how busy a place is expected to be so they can make better decisions about where to go and when. In fact, as people around the world adjusted to life during the pandemic, they used popular times and live busyness information more. We saw engagement with these features rise 50 percent between March and May as more people tapped, scrolled and compared data to find the best days and times to go places.
We’ve been expanding live busyness information to millions of places around the world, and are on track to increase global coverage by five times compared to June 2020. This expansion includes more outdoor areas, like beaches and parks, and essential places, like grocery stores, gas stations, laundromats and pharmacies. Busyness information will surface in directions and right on the map—so you don’t even need to search for a specific place in order to see how busy it is. This will soon be available to Android, iOS and desktop users worldwide.
A new way to source up-to-date business information
It’s hard to know how a business’ offerings have changed during the pandemic. To help people find the freshest business information possible, we’ve been using Duplex conversational technology to call businesses and verify their information on Maps and Search. Since April 2020, this information has helped make more than 3 million updates, including updated hours of operation, delivery and pickup options, and store inventory information for in-demand products such as face masks, hand sanitizer and disinfectant. To date, these updates have been viewed more than 20 billion times.
Important health and safety information about businesses is now front and center on Maps and Search. You can quickly know what safety precautions a business is taking, such as if they require customers to wear masks and make reservations, if there’s plexiglass onsite, or if their staff takes regular temperature checks. This information comes directly from businesses, and soon Google Maps users will also be able to contribute this useful information.
See helpful information right from Live View
Getting around your city looks different these days. The stakes are higher due to safety concerns, and it’s important to have all the information you need before deciding to visit a place. In the coming months, people using Android and iOS devices globally will be able to use Live View, a feature that uses AR to help you find your way, to learn more about a restaurant, store or business.
Say you’re walking around a new neighborhood, and one boutique in particular captures your attention. You’ll be able to use Live View to quickly learn if it’s open, how busy it is, its star rating, and health and safety information if available.
Do you know that song that goes, “da daaaa da da daaaa na naa naa ooohh yeah”? Or the one that starts with the guitar chords going, “da na na naa”? We all know how frustrating it is when you can’t remember the name of a song or any of the words but the tune is stuck in your head. Today at Search On, we announced that Google can now help you figure it out—no lyrics, artist name or perfect pitch required.
Hum to search for your earworm
Starting today, you can hum, whistle or sing a melody to Google to solve your earworm. On your mobile device, open the latest version of the Google app or find your Google Search widget, tap the mic icon and say “what’s this song?” or click the “Search a song” button. Then start humming for 10-15 seconds. On Google Assistant, it’s just as simple. Say “Hey Google, what’s this song?” and then hum the tune. This feature is currently available in English on iOS, and in more than 20 languages on Android. And we hope to expand this to more languages in the future.
After you’re finished humming, our machine learning algorithm helps identify potential song matches. And don’t worry, you don’t need perfect pitch to use this feature. We’ll show you the most likely options based on the tune. Then you can select the best match and explore information on the song and artist, view any accompanying music videos or listen to the song on your favorite music app, find the lyrics, read analysis and even check out other recordings of the song when available.
How machines learn melodies
So how does it work? An easy way to explain it is that a song’s melody is like its fingerprint: They each have their own unique identity. We’ve built machine learning models that can match your hum, whistle or singing to the right “fingerprint.”
When you hum a melody into Search, our machine learning models transform the audio into a number-based sequence representing the song’s melody. Our models are trained to identify songs based on a variety of sources, including humans singing, whistling or humming, as well as studio recordings. The algorithms also take away all the other details, like accompanying instruments and the voice’s timbre and tone. What we’re left with is the song’s number-based sequence, or the fingerprint.
We compare these sequences to thousands of songs from around the world and identify potential matches in real time. For example, if you listen to Tones and I’s “Dance Monkey,” you’ll recognize the song whether it was sung, whistled, or hummed. Similarly, our machine learning models recognize the melody of the studio-recorded version of the song, which we can use to match it with a person’s hummed audio.
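A highly simplified sketch of this kind of matching: represent each melody by the sequence of intervals between consecutive pitches, which makes the fingerprint invariant to the key you hum in, then find the catalog song whose fingerprint is closest to the query. The pitch representation, distance function, and tiny in-memory catalog are all assumptions for illustration; a real system extracts pitch from audio with a trained model and searches millions of songs with an approximate-nearest-neighbor index.

```python
def melody_fingerprint(pitches):
    """Reduce a pitch sequence (e.g. MIDI note numbers) to a key-invariant
    fingerprint: the intervals between consecutive notes. Humming in a
    different key shifts every pitch by a constant, leaving the intervals
    (and thus the fingerprint) unchanged.
    """
    return tuple(b - a for a, b in zip(pitches, pitches[1:]))

def best_match(query_pitches, catalog):
    """Return the catalog song whose fingerprint best matches the hummed
    query. `catalog` maps song titles to pitch sequences.
    """
    query = melody_fingerprint(query_pitches)

    def distance(fingerprint):
        n = min(len(fingerprint), len(query))
        # Sum of interval differences, plus a penalty for length mismatch.
        return (sum(abs(a - b) for a, b in zip(fingerprint[:n], query[:n]))
                + abs(len(fingerprint) - len(query)))

    return min(catalog, key=lambda title: distance(melody_fingerprint(catalog[title])))
```

Because the fingerprint discards absolute pitch, a hum transposed up or down from the studio recording still lands on the right song, which is the key-invariance property the post describes.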
This builds on the work of our AI Research team’s music recognition technology. We launched Now Playing on the Pixel 2 in 2017, using deep neural networks to bring low-power recognition of music to mobile devices. In 2018, we brought the same technology to the SoundSearch feature in the Google app and expanded the reach to a catalog of millions of songs. This new experience takes it a step further, because now we can recognize songs without the lyrics or original song. All we need is a hum.
So next time you can’t remember the name of some catchy song you heard on the radio or that classic jam your parents love, just start humming. You’ll have your answer in record time.