
#AndroidDevJourney spotlight – January edition

Posted by Luli Perkins, Developer Relations Program Manager


We kicked off the #AndroidDevJourney to give members of our community the opportunity to share their stories through our social platforms. Each Saturday from January through June we’ll feature a new developer on our Twitter account. We have received an overwhelming number of inspirational stories and hope you enjoy reading through the ones we’ve selected below.

For a chance to be featured in our February spotlight series, tweet us your story using #AndroidDevJourney.


Niharika Arora

Tell me about your journey in becoming an Android Developer and how you got started.

My Android journey started when I was in my fourth year of undergrad studies. I got an internship at a startup named GreenAppleSolutions, where I got the chance to work on an Android project from scratch, and luckily my first project went live on the Play Store. During that internship I found Android so interesting because everything you code, you can see running live in front of you on your device. I started loving Android and decided to make it my career path.

What’s one shortcut, tip, or hack you can’t live without?

I am a big fan of Android Lint, which has saved me many times from having to manually track down deprecated calls and APIs. It has also helped me follow best practices and make my code more optimized, secure, and performant.

What’s the one piece of advice you wish someone would have given you when you started on your journey?

Actually, there are two:

  • Clearing up even a small doubt is important, even if you think it's a stupid one. Ask as many questions as you need until you're satisfied with the answer.
  • Reading tutorials is good, but also explore the documentation in depth. It might feel like too much at first, but it will build you into a good developer in the long run.

Walmyr Carvalho

Tell me about your journey in becoming an Android Developer and how you got started.

Funny thing! I started working with mobile on iOS in 2010, but in 2011 my college final project was an app for civil construction and nobody on the team had a Mac, so we built it for Android (we got a 10, btw!). At the time I was teaching technology to government employees and wasn't that into coding, but after that project I got my first job as a Junior Android Developer in 2011, and it got me so hooked on the platform that I couldn't leave!

I got to work with Java on Eclipse + ADT, Holo, ActionBarSherlock, and the beginnings of Material Design, and I was attending Google I/O ’13 when Google announced Android Studio, which was a humbling but insightful experience, not only because of what I learned but also because of the people I met who helped me a lot!

Since then, I’ve been working with mobile, mostly Android, for more than 10 years now, helping many Brazilian tech companies and unicorns with their Android projects, and since 2016 I’ve been one of the Google Developer Experts for Android here.

Also, I love the development and design communities, so I try to be involved as much as I can. I’m a former organizer of GDG São Paulo and the creator and organizer of Kotlin Meetup São Paulo and Android Dev BR, the biggest Brazilian/Lusophone Android community in the world, with more than 7,500 members!

Lastly, I’m also involved with the national startup community, as a mentor for ACE Startups and Google For Startups Accelerator programs in Brazil.

What’s one Android development shortcut, tip, or hack you can’t live without?


There’s a simple but powerful shortcut in Android Studio that I use a lot: multi-cursor occurrence selection. Use Ctrl + G (macOS) / Alt + J (Windows and Linux) to select occurrences incrementally, or Ctrl + Cmd + G / Shift + Ctrl + Alt + J to select all occurrences at once. It seems silly, but this shortcut helps me so much, especially when it comes to refactoring. I use it every day!

What’s the one piece of advice you wish someone would have given you when you started on your journey?

I think I would sum up my advice in two words: learn and share.

Learn as much as you can, not only from the amazing content in the official documentation and from the community, but also from your own mistakes through consistent practice. There’s a lot of content available for free on the internet, and both Google and GDEs (Google Developer Experts) like me can help you get going, so keep practicing and keep building your knowledge online!

And once you learn, share with other people! If I’m where I am today, it’s because I was able to share what I couldn’t find when I was learning, so please, share your knowledge! The Android community is amazing and super helpful; you can literally reach the creators of the APIs and libraries you use on Twitter, Reddit, and many other places. Write an article, record a podcast or a video: there are many formats you can use.

The internet is such a powerful tool for learning and sharing, and I really recommend you use it for both. I’m definitely here to help if needed! 🙂


Nate Washington

Tell me about your journey in becoming an Android Developer and how you got started.

I became an Android developer in 2015, while working on my first business idea. I couldn’t afford to go back to school, so I decided to try my hand at starting a business instead. I launched a web application, but my customers insisted on having a native app for their needs as well. I originally looked for someone with more experience, but ultimately decided to just teach myself how to build an Android app. Fast forward to 2017, and my cofounder Christian and I launched the Android app for our company, Qoins, on the Google Play Store. Since then, we’ve served tens of thousands of Android customers and raised a few rounds of funding.

What’s one Android development shortcut, tip, or hack you can’t live without?

Being able to test my Android builds on virtual devices is a lifesaver. There are a lot of different scenarios to account for when building Android apps for thousands of different devices. Tools such as Firebase Test Lab, as well as other virtual device services, let me create specific scenarios for hands-on testing that I can’t achieve with the physical Android devices I own.

What’s the one piece of advice you wish someone would have given you when you started on your journey?

Making mistakes is OK; it’s all part of the process.


Yuki Anzai

Tell me about your journey in becoming an Android Developer and how you got started.

My journey began when I got my very first Android device, the HTC Magic, at Google Developer Day 2009. At the time, I was a college student writing a personal application in JavaFX, so I was already familiar with Java, and I soon started porting my app to Android. After graduation I worked at a software company and wanted to develop Android apps as my job, but there seemed to be no opportunity to do so at that company. So I created my own small company, an agency that develops Android apps.

What’s one Android development shortcut, tip, or hack you can’t live without?

There are many. If I had to pick one, it would be Android Studio itself. I always appreciate how awesome Android Studio is because I started Android app development with Eclipse. (I also can’t live without Kotlin, RecyclerView, ConstraintLayout…)

The Android Studio shortcut I can’t live without is Command + B (Declaration or Usages). It lets you jump between a declaration and its usages, which is very useful for reading source code, including the Android platform and library code.

What’s the one piece of advice you wish someone would have given you when you started on your journey?

Read the official documentation. Read the source code of the platform and of the libraries you use. One way to accelerate your learning is to build an app from start to finish (all the way to release on the market).

Don’t rely too much on libraries, especially ones that affect the whole structure of your app. Your app might live longer than the libraries.


Madona Syombua

Tell me about your journey in becoming an Android Developer and how you got started.

My Android journey started back in early 2014; before that, I worked as a junior Java developer for a small firm building inventory systems. That work didn’t interest me, though, and I kept looking for something great to do with my Java knowledge. Then I bought my first phone, a Nokia, saw the apps on it, and wondered how those apps were made. I did some research, learned that they were actually written in Java, and that’s how my journey began.

I recall building my first application, Simple Math, with only activities, since fragments didn’t exist yet; what an improvement we’ve had over the years. Simple Math got 500 downloads with a 4.5 rating, which really motivated me to build more applications. I later won the Grow With Google Scholarship (2018), which boosted my career. During that one-year scholarship, I launched my second application, Budgeting Buddy, on the Google Play Store; it has a 4.5 rating and over five thousand downloads. I currently work at Streem as an Android Engineer, and I truly love how far Android has come and how the technology and maintenance have improved over the years, especially the emulator.

What’s one Android development shortcut, tip, or hack you can’t live without?

A shortcut I can’t live without is Option + Command + L, along with Option + Command + O; these really help me during my pull request process. An amazing hack I’ve learned to appreciate is the local history option: WOW, what a lifesaver. Sometimes you forget what you’ve changed, but this hack always saves me.

What’s the one piece of advice you wish someone would have given you when you started on your journey?

Actually, when I transitioned completely into mobile, I realized the learning curve was something I would have to make room for in my life, and that mindset has really helped me: always staying ahead of the game by learning what’s new, what’s being recommended, and why it’s needed. For instance, Room was an amazing advancement, now Dagger Hilt, and many more. So if I can turn this around into advice for new developers: be ready to keep learning, and you will enjoy Android development.


The Android Developer community prides itself on its inclusivity and welcomes developers from all backgrounds and stages of life. If you’re feeling inspired and want to learn more about how to become a part of our community, here are a few resources to help get you started.

Dive into developer.android.com

Follow us on Twitter

Subscribe to our YouTube channel


The Google Developer Groups program gives developers the opportunity to meet local developers with similar interests in technology. A GDG meetup event includes talks on a wide range of technical topics where you can learn new skills through hands-on workshops.

Join a chapter near you here.


Founded in 2014, Google’s Women Techmakers is dedicated to helping all women thrive in tech through community, visibility and resources. With a member base of over 100,000 women developers, we’re working with communities across the globe to build a world where all women can thrive in tech.

Become a member here.


The Google Developers Experts program is a global network of highly experienced technology experts, influencers and thought leaders who actively support developers, companies and tech communities by speaking at events, publishing content, and building innovative apps…

Google Workspace Updates Weekly Recap – January 29, 2021

New updates

There are no new updates to share this week. Please see below for a recap of published announcements.

Previous announcements

Oops! You may have noticed we missed publishing last week’s recap. To make up for it, below we’re including announcements published on the Workspace Updates blog in the past two weeks. Please refer to the original blog posts for complete details.
Resize the Chat and Rooms sections in Gmail on the web
You can now resize the Chat and Rooms sections in the left-side navigation of Gmail on the web. | Learn more.
Quickly navigate to active cells and ranges with the new range name box in Google Sheets
We’re adding a range name box, located to the left of the formula bar, to improve navigation in Google Sheets. | Learn more.
Enable offline support for Google Calendar on web from your computer
You can now enable offline support for Google Calendar on Google Chrome from your computer. | Learn more.
Better privacy when screen sharing with muted web notifications
Now when you’re sharing your screen, Chrome will automatically hide the content of web pop-up notifications. This includes notifications from Google Chat, email notifications, and other third party websites. | Learn more.
Out of office information will now display when replying to or mentioning a user in a Google Docs comment
In Google Docs, you’ll now see out of office information when replying to or mentioning other users in a comment. | Learn more.
Control background replacement in Google Meet with a new admin setting
We’re adding the ability for admins to enable or disable the use of custom or preset backgrounds in Google Meet for meetings organized at the organizational unit (OU) level. | Learn more.
Indirect membership visibility and membership hierarchy APIs now generally available
We’re making it easier to identify, audit, and understand indirect group membership via the Cloud Identity Groups API. Specifically, we’re making the membership visibility and membership hierarchy APIs generally available. | Available to Google Workspace Enterprise Standard and Enterprise Plus, as well as G Suite Enterprise for Education and Cloud Identity Premium customers. | Learn more.

Read More

Google Summer of Code 2021 is open for mentor organization applications!


With the new year comes the start of our 17th edition of Google Summer of Code (GSoC)! Right now open source projects and organizations can apply to participate as mentoring organizations for the students in the 2021 program. GSoC is a global program that draws student developers (18 years old and over) from around the world to contribute to open source projects. This year, from June 7th to August 16th, each student will spend 10 weeks working on a coding project with the support of volunteer mentors from participating open source organizations.

Does your open source project want to learn more about becoming a mentoring organization? Visit the program site and read the mentor guide to learn what it means to be a mentor organization, how to prepare your community (hint: have plenty of enthusiastic mentors!), how to create appropriate project ideas (~175-hour projects for each student), and tips for preparing your application.

We welcome all types of organizations and are very eager to involve first-time organizations with a 2021 goal of accepting 40 new orgs. We encourage veteran organizations to refer other organizations they think would be a good fit to participate in GSoC as well.

Last year, 1,106 students completed the program under the guidance of over 2,000 mentors from 198 open source organizations. Many types of open source organizations are involved in GSoC, from small and medium-sized open source organizations to larger, umbrella organizations with many sub-projects under them (Python Software Foundation, Apache Software Foundation, etc.). Some organizations are relatively young (less than 2 years old), while other organizations have been around for 20+ years.

You can apply to be a mentoring organization for GSoC starting today on the program site. The deadline to apply is February 19th at 19:00 UTC. We will publicly announce the organizations chosen for GSoC 2021 on March 9th.

Please visit the program site for more information on how to apply and review the detailed timeline of important deadlines. We also encourage you to check out the Mentor Guide and our short video on why open source projects want to be a part of the GSoC program.

Good luck to all open source mentoring organization applicants!

By Stephanie Taylor, Google Open Source

Read More

Data Driven Security Hardening in Android

Posted by Kevin Deus, Joel Galenson, Billy Lau and Ivan Lozano, Android Security & Privacy Team

The Android platform team is committed to securing Android for every user across every device. In addition to monthly security updates to patch vulnerabilities reported to us through our Vulnerability Rewards Program (VRP), we also proactively architect Android to protect against undiscovered vulnerabilities through hardening measures such as applying compiler-based mitigations and improving sandboxing. This post focuses on the decision-making process that goes into these proactive measures: in particular, how we choose which hardening techniques to deploy and where they are deployed. As device capabilities vary widely within the Android ecosystem, these decisions must be made carefully, guided by data available to us to maximize the value to the ecosystem as a whole.

The overall approach to Android Security is multi-pronged and leverages several principles and techniques to arrive at data-guided solutions to make future exploitation more difficult. In particular, when it comes to hardening the platform, we try to answer the following questions:

  • What data are available and how can they guide security decisions?
  • What mitigations are available, how can they be improved, and where should they be enabled?
  • What are the deployment challenges of particular mitigations and what tradeoffs are there to consider?

By shedding some light on the process we use to choose security features for Android, we hope to provide a better understanding of Android’s overall approach to protecting our users.

Data-driven security decision-making

We use a variety of sources to determine what areas of the platform would benefit the most from different types of security mitigations. The Android Vulnerability Rewards Program (VRP) is one very informative source: all vulnerabilities submitted through this program are analyzed by our security engineers to determine the root cause of each vulnerability and its overall severity (based on these guidelines). Other sources are internal and external bug reports, which identify vulnerable components and reveal coding practices that commonly lead to errors. Knowledge of problematic code patterns, combined with the prevalence and severity of the vulnerabilities they cause, can help inform decisions about which mitigations are likely to be the most beneficial.


Types of Critical and High severity vulnerabilities fixed in Android Security Bulletins in 2019

Relying purely on vulnerability reports is not sufficient, as the data are inherently biased: security researchers often flock to “hot” areas where other researchers have already found vulnerabilities (e.g. Stagefright), or to areas where readily available tools make it easier to find bugs (for instance, once a security research tool is posted to GitHub, other researchers commonly use it to explore deeper).

To ensure that mitigation efforts are not biased only toward areas where bugs and vulnerabilities have been reported, internal Red Teams analyze less scrutinized or more complex parts of the platform. In addition, continuous automated fuzzers run at scale on both Android virtual machines and physical devices, which ensures that bugs can be found and fixed early in the development lifecycle. Any vulnerabilities uncovered through this process are also analyzed for root cause and severity, which inform mitigation deployment decisions.

The Android VRP rewards submissions of full exploit-chains that demonstrate a full end-to-end attack. These exploit-chains, which generally utilize multiple vulnerabilities, are very informative in demonstrating techniques that attackers use to chain vulnerabilities together to accomplish their goals. Whenever a researcher submits a full exploit chain, a team of security engineers analyzes and documents the overall approach, each link in the chain, and any innovative attack strategies used. This analysis informs which exploit mitigation strategies could be employed to prevent pivoting directly from one vulnerability to another (some examples include Address Space Layout Randomization and Control-Flow Integrity) and whether the process’s attack surface could be reduced if it has unnecessary access to resources.

There are often multiple different ways to use a collection of vulnerabilities to create an exploit chain. Therefore a defense-in-depth approach is beneficial, with the goal of reducing the usefulness of some vulnerabilities and lengthening exploit chains so that successful exploitation requires more vulnerabilities. This increases the cost for an attacker to develop a full exploit chain.

Keeping up with developments in the wider security community helps us understand the current threat landscape, what techniques are currently used for exploitation, and what future trends look like. This involves but is not limited to:

  • Close collaboration with the external security research community
  • Reading journals and attending conferences
  • Monitoring techniques used by malware
  • Following security research trends in security communities
  • Participating in external efforts and projects such as KSPP, syzbot, LLVM, Rust, and more

All of these data sources provide feedback for the overall security hardening strategy, where new mitigations should be deployed, and what existing security mitigations should be improved.

Reasoning About Security Hardening

Hardening and Mitigations

Analyzing the data reveals areas where broader mitigations can eliminate entire classes of vulnerabilities. For instance, if parts of the platform show a large number of vulnerabilities due to integer overflow bugs, they are good candidates to enable Undefined Behavior Sanitizer (UBSan) mitigations such as the Integer Overflow Sanitizer. When common patterns in memory access vulnerabilities appear, they inform efforts to build hardened memory allocators (enabled by default in Android 11) and implement mitigations (such as CFI) against exploitation techniques that provide better resilience against memory overflows or Use-After-Free vulnerabilities.

Before discussing how the data can be used, it is important to understand how we classify our overall efforts in hardening the platform. There are a few broadly defined buckets that hardening techniques and mitigations fit into (though sometimes a particular mitigation may not fit cleanly into any single one):

  • Exploit mitigations
    • Deterministic runtime prevention of vulnerabilities detects undefined or unexpected behavior and aborts execution when the behavior is detected. This turns potential memory corruption vulnerabilities into less harmful crashes. Often these mitigations can be enabled selectively and still be effective because they impact individual bugs. Examples include Integer Sanitizer and Bounds Sanitizer.
    • Exploitation technique mitigations target the techniques used to pivot from one vulnerability to another or to gain code execution. These mitigations theoretically may render some vulnerabilities useless, but more often serve to constrain the actions available to attackers seeking to exploit vulnerabilities. This increases the difficulty of exploit development in terms of time and resources. These mitigations may need to be enabled across an entire process’s memory space to be effective. Examples include Address Space Layout Randomization, Control Flow Integrity (CFI), Stack Canaries and Memory Tagging.
    • Compiler transformations that change undefined behavior to defined behavior at compile-time. This prevents attackers from taking advantage of undefined behavior such as uninitialized memory. An example of this is stack initialization.
  • Architectural decomposition
    • Splits larger, more privileged components into smaller pieces, each of which has fewer privileges than the original. After this decomposition, a vulnerability in one of the smaller components will have reduced severity by providing less access to the system, lengthening exploit chains, and making it harder for an attacker to gain access to sensitive data or additional privilege escalation paths.
  • Sandboxing/isolation
    • Related to architectural decomposition, enforces a minimal set of permissions/capabilities that a process needs to correctly function, often through mandatory and/or discretionary access control. Like architectural decomposition, this makes vulnerabilities in these processes less valuable as there are fewer things attackers can do in that execution context, by applying the principle of least privilege. Some examples are Android Permissions, Unix Permissions, Linux Capabilities, SELinux, and Seccomp.
  • Migrating to memory-safe languages
    • C and C++ do not provide memory safety the way that languages like Java, Kotlin, and Rust do. Given that the majority of security vulnerabilities reported to Android are memory safety issues, a two-pronged approach is applied: improving the safety of C/C++ while also encouraging the use of memory safe languages.

Enabling these mitigations

With the broad arsenal of mitigation techniques available, which of these to employ and where to apply them depends on the type of problem being solved. For instance, a monolithic process that handles a lot of untrusted data and does complex parsing would be a good candidate for all of these. The media frameworks provide an excellent historical example where an architectural decomposition enabled incrementally turning on more exploit mitigations and deprivileging.

Architectural decomposition and isolation of the Media Frameworks over time

Remotely reachable attack surfaces such as NFC, Bluetooth, WiFi, and media components have historically housed the most severe vulnerabilities, and as such these components are also prioritized for hardening. These components often contain some of the most common vulnerability root causes that are reported in the VRP, and we have recently enabled sanitizers in all of them.

Libraries and processes that enforce or sit at security boundaries, such as libbinder, and widely-used core libraries such as libui, libcore, and libcutils are good targets for exploit mitigations since these are not process-specific. However, due to performance and stability sensitivities around these core libraries, mitigations need to be supported by strong evidence of their security impact.

Finally, the kernel’s high level of privilege makes it an important target for hardening as well. Because different codebases have different characteristics and functionality, susceptibility to and prevalence of certain kinds of vulnerabilities will differ. Stability and performance of mitigations here are exceptionally important to avoid negatively impacting the user experience, and some mitigations that make sense to deploy in user space may not be applicable or effective. Therefore our considerations for which hardening strategies to employ in the kernel are based on a separate analysis of the available kernel-specific data.

This data-driven approach has led to tangible and measurable results. Starting in 2015 with Stagefright, a large number of Critical severity vulnerabilities were reported in Android’s media framework. These were especially sensitive because many of these vulnerabilities were remotely reachable. This led to a large architectural decomposition effort in Android Nougat, followed by additional efforts to improve our ability to patch media vulnerabilities quickly. Thanks to these changes, in 2020 we had no internet-reachable Critical severity vulnerabilities reported to us in the media frameworks.

Deployment Considerations

Some of these mitigations provide more value than others, so it is important to focus engineering resources where they are most effective. This involves weighing the performance cost of each mitigation as well as how much work is required to deploy it and support it without negatively affecting device stability or user experience.

Performance

Understanding the performance impact of a mitigation is a critical step toward enabling it. Adding too much overhead to some components or the entire system can negatively impact user experience by reducing battery life and making the device less responsive. This is especially true for entry-level devices, which should benefit from hardening as well. We thus want to prioritize engineering efforts on impactful mitigations with acceptable overheads.

When investigating performance, important factors include not just CPU time but also memory increase, code size, battery life, and UI jank. These factors are especially important to consider for more constrained entry-level devices, to ensure that the mitigations perform well across the entire Android ecosystem.

The system-wide performance impact of a mitigation is also dependent on where that mitigation is enabled, as certain components are more performance-sensitive than others. For example, binder is one of the most used paths for interprocess communication, so even small additional overhead could significantly impact user experience on a device. On the other hand, video players only need to ensure that frames are rendered at the source framerate; if frames are rendered much faster than the rate at which they are displayed, additional overhead may be more acceptable.

Benchmarks, if available, can be extremely useful to evaluate the performance impact of a mitigation. If there are no benchmarks for a certain component, new ones should be created, for instance by calling impacted codec code to decode a media file (see the sketch after this list). If this testing reveals unacceptable overhead, there are often a few options to address it:

  • Selectively disable the mitigation in performance-sensitive functions identified during benchmarks.
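
As a rough illustration of the benchmarking step described above, a minimal timing harness might look like the following sketch. Here `decode_file` is a hypothetical stand-in for the workload under test; real mitigation benchmarks would exercise the actual native code paths.

```python
import statistics
import time

def benchmark(fn, *args, repeats=30):
    """Time repeated calls to fn and report the median wall-clock latency."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)

# Compare the same workload with the mitigation off vs. on; decode_file and
# decode_file_hardened are hypothetical stand-ins for the impacted codec path.
# baseline = benchmark(decode_file, "sample.mp4")
# hardened = benchmark(decode_file_hardened, "sample.mp4")
# print(f"overhead: {(hardened / baseline - 1) * 100:.1f}%")
```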

Learning to Reason Over Tables from Less Data

Posted by Julian Eisenschlos, AI Resident, Google Research, Zürich

The task of recognizing textual entailment, also known as natural language inference, consists of determining whether a piece of text (a hypothesis) is implied or contradicted (or neither) by another piece of text (the premise). While this problem is often considered an important test of the reasoning skills of machine learning (ML) systems and has been studied in depth for plain text inputs, much less effort has been put into applying such models to structured data, such as websites, tables, and databases. Yet recognizing textual entailment is especially relevant whenever the contents of a table need to be accurately summarized and presented to a user, and it is essential for high fidelity question answering systems and virtual assistants.

In “Understanding tables with intermediate pre-training”, published in Findings of EMNLP 2020, we introduce the first pre-training tasks customized for table parsing, enabling models to learn better, faster, and from less data. We build upon our earlier TAPAS model, an extension of the BERT bi-directional Transformer model with special embeddings to find answers in tables. Applying our new pre-training objectives to TAPAS yields a new state of the art on multiple datasets involving tables. On TabFact, for example, it reduces the gap between model and human performance by ~50%. We also systematically benchmark methods of selecting relevant input for higher efficiency, achieving 4x gains in speed and memory while retaining 92% of the results. All the models for different tasks and sizes are released in our GitHub repo, where you can try them out yourself in a Colab notebook.

Textual Entailment
The task of textual entailment is more challenging when applied to tabular data than plain text. Consider, for example, a table from Wikipedia with some sentences derived from its associated table content. Assessing if the content of the table entails or contradicts the sentence may require looking over multiple columns and rows, and possibly performing simple numeric computations, like averaging, summing, differencing, etc.

A table together with some statements from TabFact. The content of the table can be used to support or contradict the statements.

Following the methods used by TAPAS, we encode the content of a statement and a table together, pass them through a Transformer model, and obtain a single number with the probability that the statement is entailed or refuted by the table.

The TAPAS model architecture uses a BERT model to encode the statement and the flattened table, read row by row. Special embeddings are used to encode the table structure. The vector output of the first token is used to predict the probability of entailment.
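
As an illustration, here is a minimal sketch of that predict step using the Hugging Face port of TAPAS. The library, class names, and TabFact checkpoint below come from that port rather than the original TensorFlow codebase, and are shown only as an example:

```python
# Minimal sketch using the Hugging Face port of TAPAS; the checkpoint name
# is that library's TabFact-finetuned model, assumed here for illustration.
import pandas as pd
import torch
from transformers import TapasForSequenceClassification, TapasTokenizer

model_name = "google/tapas-base-finetuned-tabfact"
tokenizer = TapasTokenizer.from_pretrained(model_name)
model = TapasForSequenceClassification.from_pretrained(model_name)

# All table cells must be strings for the TAPAS tokenizer.
table = pd.DataFrame({"Player": ["Greg Norman", "Billy Mayfair"],
                      "Rank": ["1", "2"]})
statement = "Greg Norman and Billy Mayfair tie in rank"

# The statement and the flattened table are encoded together, BERT-style.
inputs = tokenizer(table=table, queries=statement, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

# Index 1 is assumed to be the "entailed" class in this checkpoint's labels.
prob_entailed = torch.softmax(logits, dim=-1)[0, 1].item()
print(f"P(statement entailed by table) = {prob_entailed:.2f}")
```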

Because the only information in the training examples is a binary value (i.e., “correct” or “incorrect”), training a model to understand whether a statement is entailed or not is challenging and highlights the difficulty of achieving generalization in deep learning, especially when the provided training signal is scarce. Seeing isolated entailed or refuted examples, a model can easily pick up on spurious patterns in the data to make a prediction, such as the presence of the word “tie” in “Greg Norman and Billy Mayfair tie in rank”, instead of truly comparing the players’ ranks, which is what is needed to successfully apply the model beyond the original training data.

Pre-training Tasks
Pre-training tasks can be used to “warm up” models by providing them with large amounts of readily available unlabeled data. However, pre-training typically includes primarily plain text and not tabular data. In fact, TAPAS was originally pre-trained using a simple masked language modelling objective that was not designed for tabular data applications. To improve the model’s performance on tabular data, we introduce two novel pre-training binary classification tasks, called counterfactual and synthetic, which can be applied as a second stage of pre-training (often called intermediate pre-training).

In the counterfactual task, we source sentences from Wikipedia that mention an entity (person, place or thing) that also appears in a given table. Then, 50% of the time, we modify the statement by swapping the entity for another alternative. To make sure the statement is realistic, we choose a replacement among the entities in the same column in the table. The model is trained to recognize whether the statement was modified or not. This pre-training task includes millions of such examples, and although the reasoning about them is not complex, they typically will still sound natural.
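
To make this concrete, here is a toy sketch of how such counterfactual examples could be generated. The function and data below are illustrative stand-ins, not the actual pipeline:

```python
import random

def make_counterfactual(sentence, entity, column_values):
    """Toy counterfactual generation: half the time, swap the entity
    mentioned in the sentence for another value from the same table
    column; the label records whether the statement was modified."""
    candidates = [v for v in column_values if v != entity]
    if candidates and random.random() < 0.5:
        return sentence.replace(entity, random.choice(candidates)), "modified"
    return sentence, "unmodified"

# "Greg Norman" appears both in the sentence and in the table's Player column.
sentence = "Greg Norman finished the tournament in first place."
players = ["Greg Norman", "Billy Mayfair", "Steve Elkington"]
print(make_counterfactual(sentence, "Greg Norman", players))
```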

For the synthetic task, we follow a method similar to semantic parsing, in which we generate statements using a simple set of grammar rules that require the model to understand basic mathematical operations, such as sums and averages (e.g., “the sum of earnings”), or to understand how to filter the elements in the table using some condition (e.g., “the country is Australia”). Although these statements are artificial, they help improve the numerical and logical reasoning skills of the model.
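
A similarly simplified sketch of the synthetic task, with a tiny hand-written grammar standing in for the real rule set:

```python
import random
import pandas as pd

def make_synthetic(table):
    """Toy grammar: assert the sum or average of a numeric column; half
    the time the value is corrupted so the statement is refuted."""
    column = random.choice([c for c in table.columns
                            if pd.api.types.is_numeric_dtype(table[c])])
    op, value = random.choice([("sum", float(table[column].sum())),
                               ("average", float(table[column].mean()))])
    label = "entailed"
    if random.random() < 0.5:
        value += random.choice([-1, 1]) * max(1.0, abs(value) * 0.1)
        label = "refuted"
    return f"the {op} of {column} is {value:g}", label

table = pd.DataFrame({"country": ["Australia", "Chile"],
                      "earnings": [120, 80]})
print(make_synthetic(table))
```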

Example instances for the two novel pre-training tasks. Counterfactual examples swap entities mentioned in a sentence that accompanies the input table for a plausible alternative. Synthetic statements use grammar rules to create new sentences that require combining the information of the table in complex ways.

Results
We evaluate the success of the counterfactual and synthetic pre-training objectives on the TabFact dataset by comparing to the baseline TAPAS model and to two prior models that have exhibited success in the textual entailment domain, LogicalFactChecker (LFC) and Structure Aware Transformer (SAT). The baseline TAPAS model exhibits improved performance relative to LFC and SAT, but the pre-trained model (TAPAS+CS) performs significantly better, achieving a new state of the art.

We also apply TAPAS+CS to question answering tasks on the SQA dataset, which requires that the model find answers from the content of tables in a dialog setting. The inclusion of CS objectives improves the previous best performance by more than 4 points, demonstrating that this approach also generalizes performance beyond just textual entailment.

Results on TabFact (left) and SQA (right). Using the synthetic and counterfactual datasets, we achieve new state-of-the-art results in both tasks by a large margin.

Data and Compute Efficiency
Another aspect of the counterfactual and synthetic pre-training tasks is that, since the models are already tuned for binary classification, they can be applied to TabFact without any fine-tuning. We explore what happens to each of the models when trained on only a subset (or even none) of the data. Without looking at a single example, the TAPAS+CS model is competitive with a strong Table-BERT baseline, and when only 10% of the data are included, the results are comparable to the previous state of the art.

Dev accuracy on TabFact relative to the fraction of the training data used.

A general concern when trying to use large models such as this one on tables is that their high computational requirements make it difficult to parse very large tables. To address this, we investigate whether one can heuristically select subsets of the input to pass through the model in order to optimize its computational efficiency.

We conducted a systematic study of different approaches to filter the input and discovered that simple methods that select for word overlap between a full column and the subject statement give the best results. By dynamically selecting which tokens of the input to include, we can use fewer resources or work on larger inputs at the same cost. The challenge is doing so without losing important information and hurting accuracy. 
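
As a rough sketch of such a heuristic (a simplified illustration, not the exact scoring we used), assuming a pandas table:

```python
import pandas as pd

def rank_columns_by_overlap(statement, table):
    """Score each column by word overlap between the statement and the
    column header plus its cell values; highest overlap ranks first."""
    statement_tokens = set(statement.lower().split())
    scores = {}
    for column in table.columns:
        column_tokens = set(str(column).lower().split())
        for cell in table[column]:
            column_tokens |= set(str(cell).lower().split())
        scores[column] = len(statement_tokens & column_tokens)
    return sorted(table.columns, key=lambda c: scores[c], reverse=True)

table = pd.DataFrame({"Player": ["Greg Norman", "Billy Mayfair"],
                      "Rank": ["1", "2"],
                      "Country": ["Australia", "United States"]})
# Columns are kept in ranked order until the token budget (e.g. 256) is spent.
print(rank_columns_by_overlap("Greg Norman and Billy Mayfair tie in rank",
                              table))
```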

For instance, the models discussed above all use sequences of 512 tokens, which is around the normal limit for a transformer model (although recent efficiency methods like the Reformer or Performer are proving effective in scaling the input size). The column selection methods we propose here can allow for faster training while still achieving high accuracy on TabFact. For 256 input tokens we get a very small drop in accuracy, but the model can now be pre-trained, fine-tuned and make predictions up to two times faster. With 128 tokens the model still outperforms the previous state-of-the-art model, with an even more significant speed-up — 4x faster across the board.

Accuracy on TabFact using different sequence lengths, by shortening the input with our column selection method.

Using both the column selection method we proposed and the novel pre-training tasks, we can create table parsing models that need fewer data and less compute power to obtain better results.

We have made the new models and pre-training techniques available in our GitHub repo, where you can try them out yourself in Colab. To make this approach more accessible, we also shared models of varying sizes, all the way down to “tiny”. We hope these results will help spur the development of table reasoning in the broader research community.

Acknowledgements
This work was carried out by Julian Martin Eisenschlos, Syrine Krichene and Thomas Müller from our Language Team in Zürich. We would like to thank Jordan Boyd-Graber, Yasemin Altun, Emily Pitler, Benjamin Boerschinger, Srini Narayanan, Slav Petrov, William Cohen and Jonathan Herzig for their useful comments and suggestions.

Read More

Opening up Google’s Windows management tools

Managing a global fleet of Windows desktops, laptops, and servers for Google’s internal teams can be tricky, with a constant stream of new tools, high expectations, and stringent organizational needs for secure, code-based, scalable administration. Add in a globally distributed business and extended work-from-home requirements, and you have a recipe for potential trouble.

Today we’d like to walk you through some of the tools that the Windows Operations (WinOps) team uses at Google, and why we made (and open-sourced) them. Our team is constantly working to improve the process we use to manage our client fleet of laptops and desktops, and we’ve spent the past several years building open source, infrastructure-as-code tools to do just that. 


Now that we’re all working from home, these choices have enabled us to keep operating at scale remotely. Let’s dig into a few common Windows administrative challenges and how our open tools can help.

Challenges with scale

When you manage Windows in a large, globally distributed business environment, problems of scalability are front and center. Many popular administrative tools are GUI-based, which makes them easy to learn but difficult to scale and integrate. An administrator is often limited to the functionality built into the product by its vendor. Many times, core management suites lack qualities that we would consider critical in a reliable production environment, including the ability to: 

  • Peer review edits and to roll changes backward and forward on demand 
  • Implement platform testing, with support for automation pipelines 
  • Integrate seamlessly with tooling that also manages our other major platforms 

Because they rely on explicit network-level access, many of these products also depend heavily on a well-defined corporate network, with clear distinctions between inside and outside.

At Google, we’ve been rethinking the way we manage Windows to address these limitations. We have built several tools that have helped us scale our environment globally and enabled us to consistently support Google employees, even when major unexpected events happen.


Open source products are increasingly a key to our success. With the right knowledge and investment, open source tools can be extended and tailored to our environment in ways other applications simply can’t. Our designs also focus heavily on configuration as code, rather than user interfaces: Code-based infrastructure provides optimal integration with other internal systems, and enables us to manage our fleet in ways that are audited, peer reviewed, and thoroughly tested. Finally, the principles of the BeyondCorp model dictate that our management layer operates from anywhere in the world, rather than only inside the company’s private network.

Let’s dig into some of these tools, organized by what they help us get done.

Prepping Windows devices

Glazier, a tool for imaging, marked our team’s first foray into open source. This Python-based tool is at the core of our Windows device preparation process. It focuses on text-based configuration, which we can manage using a version control system. Much like code, we can use the flexible format to write automated tests for our configuration files, and trivially roll our deployments back and forward. File distribution is based around HTTPS, making it globally scalable and easy to proxy. Glazier supports modular actions (such as installing host certificates or gathering installation metrics), making it simple to extend with new capabilities over time as our environment changes.

Secure, modular imaging with Glazier helps prepare devices
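
To give a flavor of what a modular action can look like, here is a hypothetical sketch; the class and method names are illustrative, not Glazier's actual API surface:

```python
# Hypothetical sketch of a Glazier-style modular action; BaseAction and
# Run() are illustrative names, not Glazier's real API.
class BaseAction:
    """Base class for actions driven by a text-based config entry."""

    def __init__(self, args, build_info):
        self._args = args              # arguments from the config file
        self._build_info = build_info  # state shared across the image run

    def Run(self):
        raise NotImplementedError


class InstallHostCertificate(BaseAction):
    """Example action: install a host certificate fetched over HTTPS."""

    def Run(self):
        cert_url = self._args[0]
        # A real action would download over HTTPS, verify the file, and
        # install the certificate into the local machine store.
        print(f"Installing host certificate from {cert_url}")


InstallHostCertificate(["https://example.com/host.crt"], build_info={}).Run()
```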

Traditional imaging tends to rely heavily on network trust and presence inside a secure perimeter. Systems like PXE, Active Directory, Group Policy, and System Center Configuration Manager require you to either set up a device on a trusted network segment or have sensitive infrastructure exposed to the open internet. The Fresnel project addressed these limitations by making it possible to deliver boot media securely to our employees, anywhere in the world. We then integrated it with Glazier, enabling our imaging process to obtain critical files required to bootstrap an image from any network. The result was an imaging process that could be started and completed securely from anywhere, on any network, which aligns with our broader BeyondCorp security model. 

Fresnel enables imaging from any network in the world

The remote imaging and provisioning process included several other network trust dependencies that we had to resolve. Puppet provides the basis of our configuration management stack, while software delivery now leverages GooGet, an open source repository platform for Windows. GooGet’s open package format lends itself well to automation, while its simple, APT-like distribution mechanism is able to scale our package deployments globally. For both Puppet and GooGet the underlying use of HTTPS provides security and accessibility from any network. We also utilize OSQuery as a means of collecting distributed host state and inventory.

GooGet helps us automate package distribution and deployment

Our infrastructure still has dependencies on classic Active Directory (AD), and the domain join process was a particularly unique challenge for hosts that do not bootstrap from a trusted network. This led to the Splice project, which uses the Windows offline domain join API and Google Cloud services to enable domain joining from any network. Splice enables us to apply flexible business logic to the traditionally rigid domain join process. With the ability to implement custom authentication and authorization models, host inventory checks, and naming rules not typically available in AD environments, this project has given us the flexibility to extend our domain well beyond the classic network perimeter.

Splice helps us join new devices onto our Active Directory domain from anywhere

Maintaining our fleet

Deployment is only the beginning of the device lifecycle; we also need to be able to manage our active fleet and keep it secure.


The Windows internal update mechanism is generally sufficient to keep the operating system patched, but we also wanted to be able to exercise some control over updates hitting our fleet. Specifically, we need the ability to rapidly deploy a critical update, or to postpone installing a problematic one. Enter Cabbie, a Windows service that builds upon Windows APIs to provide an additional management layer for patching. Cabbie gives us centralized control over the update agent on each machine in our fleet using our existing configuration management stack.

Centralized patch control using configuration management

We also have Windows servers to manage, and these hosts present unique challenges, distinct from those we face with our client fleet. One such challenge is how to schedule routine maintenance in a way that’s easily configurable, automated, and can be integrated with our various agents like Cabbie. This led to Aukera, a simple yet flexible service for defining recurring maintenance windows, establishing periods where a device can safely perform one or more automated activities that might otherwise be disruptive.

Building for the future

Our team was fortunate to have started many of these projects well before the spring of 2020, when many of us had to abruptly leave our offices behind. This was due, in part, to embracing the idea of building a Windows fleet for the future: one where every network is part of our company network. Whether our users are working at a business office, from home, or on a virtual machine in a Cloud data center, our tools must be flexible, scalable, reliable, and manageable to meet their needs.

Most of the challenges we’ve discussed here are not unique to Google. Companies of all shapes and sizes can benefit from increasing security, scalability, and flexibility in their networks. Our goal in opening up these projects, and sharing the principles behind them, is to assist our peers in the Windows community to build stronger solutions for their own businesses.

To learn more about our wider fleet management strategy and operations, read our “Fleet Management at Scale” white paper.

Read More

Quicksave: The latest from Google Play Pass

Google Play Pass helps you connect with awesome digital content: It’s your pass to hundreds of apps and games without ads and in-app purchases. It’s been a pretty busy year for Play Pass, so let’s take a moment to spotlight a few of the games and developers we think you’ll enjoy.

Program updates

This past year, Play Pass…

  • Celebrated its first birthday

  • Expanded to 42 countries

  • Added more than 300 new apps and games, including more than 100 teacher-approved kids’ titles

New games coming and recent additions


Giant Dancing Plushies (Rogue Games, Inc.):

Help huge, adorable stuffed animals conquer the planet in this adorable (yet… terrifying) take on the rhythm game genre. Jam to the great in-game tracks or Kaiju it up to your own favorite music and get ready to stomp on the city! 


Figment (Bedtime Digital Games):

Venture into the whimsical, dream-like world of the human mind. Solve puzzles to restore the peace and rediscover the courage that’s been lost, all while beating back the nightmarish fears that threaten to take over! If you’re looking for a mind-blowing weekend playlist, we recommend checking out Figment, Samorost 3, Old Man’s Journey, and The Gardens Between (all included with your Play Pass subscription). Can you identify the theme that links them?


The Legend of Bum-Bo (The Label Limited):

Help Bum-Bo recover his lost coin in this edgy, puzzle-based, rogue-like prequel to The Binding of Isaac. We won’t give away too much, but this combo of turn-based combat and poop (yes, poop) makes for one unforgettable gaming experience.

Titles we can’t get enough of


Everything by Team17: Bust out of a life behind bars, save some sheep, and battle your way to worm domination. Almost every live Android title from this renowned publisher will be joining Play Pass. From The Escapists series and Flockers to every Worms game, Team17 sure knows how to bring it, and we’re all here for it.

The Escapists: Prison Escape

The Escapists 2: Pocket Breakout

and many more


Basketball Club Story (Kairosoft): Create your own basketball team, recruit a cast of zany players, and compete against other teams in the league! You’re the coach taking the team to victory in this sim game from Japanese developer Kairosoft. Keep an eye out for more from them soon.


Grand Mountain Adventure: Snowboard Premiere (Toppluva AB): The new Winter 2021 Expansion adds a bunch of new mountains and challenging excitement to this local multiplayer game. If you can’t hit the slopes this winter, everything you need (including an avalanche of recently added content) is included in this game. Well… everything except the après-ski festivities.


Holedown (grapefrukt games): Shoot balls, break blocks, upgrade all the things. How deep can you go? We love this game so much, and we’re excited to have just welcomed another grapefrukt game (rymdkapsel) to Play Pass.


Evoland (Playdigious): Embark on an epic action/adventure journey with plenty of humor and nods to the classics. Upgrade your graphics and gameplay as you advance on your quest. As we know, every great title has a sequel, so be on the lookout for more Evoland coming to Play Pass.

Full list of additions since December 1, 2020:

Read More

Building the digital factory with SAP on Google Cloud

Manufacturers today face challenges on many fronts: increasingly demanding customer expectations, higher costs, sustainability concerns, and disruption, most recently and dramatically due to the global COVID-19 pandemic. But data can help companies navigate their way through the obstacle course of modern manufacturing. Manufacturing generates petabytes of useful data that can improve production yields, avert problems, and spot opportunities, but this data is only as useful as a company’s ability to analyze it and act on it. SAP customers need to merge their enterprise data with machine and IoT data to inform more insightful business intelligence, feed advanced automation, and build more innovative Industry 4.0 solutions.

How? By integrating SAP’s enterprise applications with Google Cloud’s artificial intelligence (AI), machine learning (ML) and data analytics capabilities. Google Cloud simplifies SAP deployment and offers a suite of applications that integrate with and enhance SAP. Manufacturers can bring together their operational and business data at scale to build an intelligent, connected digital factory. Here are just a few ways Google Cloud brings greater value to your organization’s SAP enterprise applications:

Cloud migration with minimal risk: SAP deployments can be complex, so moving to the cloud can seem daunting. Google Cloud’s tools and services help simplify and streamline the process with security capabilities and migration options. Manufacturers can take advantage of Google Cloud’s SAP-specific automated templates to deploy more quickly, consolidate SAP data within the cloud, and shrink time to value for AI- and ML-generated insights. The Cloud Acceleration Program for SAP customers leverages our network of partners with pre-built migration solutions and applications to make cloud transitions less risky and more efficient.

Data management, solved: Running SAP on Google Cloud gives manufacturers massive and highly flexible data storage without the cost of buying or maintaining infrastructure. Manufacturers can quickly gain fresh insight, not only from historical data but also from real-time production, quality, and business data.

Multiple paths to the cloud: There are a lot of reasons to keep running legacy on-premises systems and multiple cloud deployments, including regulatory requirements and industry-specific needs. Manufacturers that rely on SAP for their core operations can take advantage of Google Cloud’s AI, ML, and analytics wherever their applications reside. Google Cloud’s hybrid and multicloud capabilities give manufacturers the strength of multiple cloud platforms, on-premises solutions, legacy providers, and a diversity of hardware. SAP manufacturers such as Kaeser Kompressoren are also taking advantage of Anthos, an application platform that lets them easily migrate and modernize legacy applications to the cloud, build new applications securely while staying in compliance, and gather and analyze their data.

Rich data integration: Manufacturers can build their digital factory from the ground up using Google Cloud’s API toolkit. By consolidating data signals from tools across the Google Cloud portfolio, such as web search data, weather, maps, shopping, and more, companies can gain insight into production planning, customer needs, and other business processes. This includes AutoML Vision capabilities that allow SAP customers to automate visual inspection, identify defects early and reduce costs.  

Intelligent analytics: Google BigQuery allows manufacturers to quickly analyze large amounts of data from a variety of sources, including SAP systems, production facilities, data lakes, sensors, and more, to make more informed decisions (see the sketch after this list). Manufacturers can train customized ML models for accurate forecasts with Cloud AutoML, which uses machine learning to build data-driven predictive maintenance models. With AI-driven demand forecasts, businesses are able to reduce production delays, improve yield at their facilities, and free up working capital.

Accelerated innovation: Package your backend SAP data and functionality as API products using Google Cloud’s Apigee API management tool. Use these rich and valuable API products with AppSheet to allow non-developers to build innovative applications faster without coding.
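
As an example of the analytics integration mentioned above, here is a minimal sketch using the BigQuery Python client library; the project, dataset, and column names are hypothetical placeholders:

```python
# Minimal sketch using the google-cloud-bigquery client library; the
# project, dataset, and column names below are hypothetical placeholders.
from google.cloud import bigquery

client = bigquery.Client(project="my-manufacturing-project")

query = """
    SELECT plant_id, AVG(yield_pct) AS avg_yield
    FROM `my-manufacturing-project.production.daily_runs`
    WHERE run_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 30 DAY)
    GROUP BY plant_id
    ORDER BY avg_yield
"""

# Run the query and print the lowest-yield plants first.
for row in client.query(query).result():
    print(f"{row.plant_id}: {row.avg_yield:.1f}% average yield over 30 days")
```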

Southwire takes the first step in its tech evolution with SAP on Google Cloud 

Southwire, one of the world’s leading manufacturers of wire and cable, tools and components, had a comprehensive plan to overhaul its SAP environment consisting of three key elements: upgrade the SAP environment to take advantage of the latest functionality; deploy SAP Business Warehouse on SAP HANA to accelerate reporting; and upgrade to the latest version of SAP Process Orchestration—an essential component that touches key manufacturing interfaces in all Southwire facilities.

“We wanted to be on a platform for SAP that was flexible, scalable, and secure; that we could count on to get up and running quickly,” says Dan Stuart, Senior Vice President of IT Services at Southwire. “We chose Google Cloud not only for those reasons, but also because we recognize that Google has other assets that we may be able to take advantage of down the line, such as technologies like artificial intelligence. There’s no shortage of areas where we think Google Cloud will come into play, and we intend to look at these things with an open mind to understand how we can leverage current investments to take our organization where we want to go.”

Getting the most value from manufacturing data 

In order to maximize the value of their data, it’s not enough for today’s manufacturers to connect disparate data streams. They must also extract insight, forecast accurately, and drive intelligent decisions. By running SAP on Google Cloud, manufacturers gain the best of both worlds: advanced digital manufacturing process control and ML and AI-driven analytics and automation.

To learn more about how Google Cloud can help your manufacturing operation leverage rich data to compete in Industry 4.0, read SAP on Google Cloud for Manufacturing, and watch this video.


Indirect membership visibility and membership hierarchy APIs now generally available

Quick launch summary 

We’re making it easier to identify, audit, and understand indirect group membership via the Cloud Identity Groups API. Specifically, we’re making the membership visibility and membership hierarchy APIs generally available. These were previously available in beta. 
Using “nested” groups to manage access to content and resources can help decrease duplication, simplify administration, and centralize access management. However, nested groups can create a complex hierarchy that can make it hard to understand who ultimately has access and why. These APIs help provide all of the information you need to understand complex group structures and hierarchies, and can help you make decisions about who to add to or remove from your groups. 
See our beta announcement for more information and use cases for the APIs.
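To make the API shape concrete, here is a minimal sketch of listing a group’s transitive (direct and indirect) memberships with the google-api-python-client library. The group resource name is a placeholder, and the credential setup is assumed to be in place; check the Cloud Identity API reference for the authoritative usage.

from googleapiclient.discovery import build

# Build a Cloud Identity API client; assumes application default
# credentials with the required admin scopes are already configured.
service = build("cloudidentity", "v1")

# "groups/<GROUP_ID>" is a placeholder for the resource name of the
# group whose nested membership we want to audit.
response = service.groups().memberships().searchTransitiveMemberships(
    parent="groups/<GROUP_ID>").execute()

# Each membership entry describes a member and how it relates to the
# group (directly, indirectly, or both), which helps explain who
# ultimately has access and why.
for membership in response.get("memberships", []):
    print(membership)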

Getting started 

Rollout pace 

Availability 

  • Available to Google Workspace Enterprise Standard and Enterprise Plus, as well as G Suite Enterprise for Education and Cloud Identity Premium customers. 
  • Not available to Google Workspace Essentials, Business Starter, Business Standard, Business Plus, and Enterprise Essentials, as well as G Suite Basic, Business, Education, and Nonprofits customers. 

Resources 


Leveraging TensorFlow-TensorRT Integration for Low-Latency Inference

Posted by Jonathan Dekhtiar (NVIDIA), Bixia Zheng (Google), Shashank Verma (NVIDIA), Chetan Tekur (NVIDIA)

TensorFlow-TensorRT (TF-TRT) is an integration of TensorFlow and TensorRT that leverages inference optimization on NVIDIA GPUs within the TensorFlow ecosystem. It provides a simple API that delivers substantial performance gains on NVIDIA GPUs with minimal effort. The integration lets you take advantage of the optimizations that are possible in TensorRT while providing a fallback to native TensorFlow for segments of the model that TensorRT does not support.

In our previous blog on TF-TRT integration, we covered the workflow for TensorFlow 1.13 and earlier releases. This blog will introduce TensorRT integration in TensorFlow 2.x and demonstrate a sample workflow with the latest API. Even if you are new to this integration, this blog contains all the information you need to get started. Using the TensorRT integration has been shown to improve performance by 2.4x compared to native TensorFlow inference on Nvidia T4 GPUs.

TF-TRT Integration

When TF-TRT is enabled, in the first step, the trained model is parsed in order to partition the graph into TensorRT-supported subgraphs and unsupported subgraphs. Then each TensorRT-supported subgraph is wrapped in a single special TensorFlow operation (TRTEngineOp). In the second step, for each TRTEngineOp node, an optimized TensorRT engine is built. The TensorRT-unsupported subgraphs remain untouched and are handled by the TensorFlow runtime. This is illustrated in Figure 1.

TF-TRT lets you keep TensorFlow’s flexibility while also taking advantage of the optimizations that can be applied to the TensorRT-supported subgraphs. Only portions of the graph are optimized and executed with TensorRT; TensorFlow executes the remaining graph.

In the inference example shown in Figure 1, TensorFlow executes the Reshape op and the Cast op. TensorFlow then passes the execution of TRTEngineOp_0, the pre-built TensorRT engine, to the TensorRT runtime.

Figure 1: An example of graph partitioning and building TRT engine in TF-TRT

Workflow

In this section, we will take a look at the typical TF-TRT workflow using an example.

Figure 2: Workflow diagram when performing inference in TensorFlow only, and in TensorFlow-TensorRT using a converted SavedModel

Figure 2 shows a standard inference workflow in native TensorFlow and contrasts it with the TF-TRT workflow. The SavedModel format contains all the information required to share or deploy a trained model. In native TensorFlow, the workflow typically involves loading the saved model and running inference using TensorFlow runtime. In TF-TRT, there are a few additional steps involved, including applying TensorRT optimizations to the TensorRT supported subgraphs of the model, and optionally pre-building the TensorRT engines.

First, we create an object to hold the conversion parameters, including a precision mode. The precision mode indicates the minimum precision (for example, FP32, FP16, or INT8) that TF-TRT can use to implement the TensorFlow operations. Then we create a converter object, which takes the conversion parameters and the input SavedModel. Note that in TensorFlow 2.x, TF-TRT only supports models saved in the TensorFlow SavedModel format.

Next, when we call the converter’s convert() method, TF-TRT converts the graph by replacing the TensorRT-compatible portions of the graph with TRTEngineOps. For better performance at runtime, the converter’s build() method can be used to create the TensorRT execution engines ahead of time. The build() method requires the input data shapes to be known before the optimized engines are built. If the input data shapes are not known, the TensorRT execution engines can instead be built at runtime, when the input data becomes available. Because the building process is GPU-specific, the execution engine should be built on the same GPU device type as the one on which inference will be executed. For example, an execution engine built for an Nvidia A100 GPU will not work on an Nvidia T4 GPU.

Finally, the TF-TRT converted model can be saved to disk by calling the save() method. The code corresponding to the workflow steps described in this section is shown in the code block below:

from tensorflow.python.compiler.tensorrt import trt_convert as trt

# Conversion Parameters
conversion_params = trt.TrtConversionParams(
    precision_mode=trt.TrtPrecisionMode.<FP32 or FP16>)

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir=input_saved_model_dir,
    conversion_params=conversion_params)

# Converter method used to partition and optimize TensorRT compatible segments
converter.convert()

# Optionally, build TensorRT engines before deployment to save time at runtime
# Note that this is GPU specific, and as a rule of thumb, we recommend building at runtime
converter.build(input_fn=my_input_fn)

# Save the model to the disk
converter.save(output_saved_model_dir)

As can be seen from the code example above, the build() method requires an input function corresponding to the shape of the input data. An example of an input function is shown below:

import numpy as np

# input_fn: a generator function that yields input data as a list or tuple,
# which will be used to execute the converted signature to generate TensorRT
# engines. Example:
def my_input_fn():
    # Let's assume a network with 2 input tensors. We generate 3 sets
    # of dummy input data:
    input_shapes = [[(1, 16), (2, 16)],  # min and max range for 1st input list
                    [(2, 32), (4, 32)],  # min and max range for 2nd list of two tensors
                    [(4, 32), (8, 32)]]  # 3rd input list
    for shapes in input_shapes:
        # return a list of input tensors
        yield [np.zeros(x).astype(np.float32) for x in shapes]

Support for INT8

Compared to FP32 and FP16, INT8 requires additional calibration data to determine the best quantization thresholds. When the precision mode in the conversion parameters is INT8, we need to provide an input function to the convert() method call. This input function is similar to the one provided to the build() method. In addition, the input function passed to the convert() method should generate calibration data that is statistically similar to the actual data seen during inference.

from tensorflow.python.compiler.tensorrt import trt_convert as trt

conversion_params = trt.TrtConversionParams(
    precision_mode=trt.TrtPrecisionMode.INT8)

converter = trt.TrtGraphConverterV2(
    input_saved_model_dir=input_saved_model_dir,
    conversion_params=conversion_params)

# requires some data for calibration
converter.convert(calibration_input_fn=my_input_fn)

# Optionally build TensorRT engines before deployment.
# Note that this is GPU specific, and as a rule of thumb we recommend building at runtime
converter.build(input_fn=my_input_fn)

converter.save(output_saved_model_dir)

Example: ResNet-50

The rest of this blog will show the workflow of taking a TensorFlow 2.x ResNet-50 model, training it, saving it, optimizing it with TF-TRT, and finally deploying it for inference. We will also compare inference throughput in native TensorFlow vs. TF-TRT in three precision modes: FP32, FP16, and INT8.

Prerequisites for the example:


Training ResNet-50 using the TensorFlow 2.x container:

First, the latest release of the ResNet-50 model needs to be downloaded from the TensorFlow GitHub repository:

# Clone the TensorFlow models repository (shallow clone) into the current directory
$ git clone --depth 1 https://github.com/tensorflow/models.git .

# List the files and directories present in our working directory
$ ls -al

rwxrwxr-x user user 4 KiB Wed Sep 30 15:31:05 2020 ./
rwxrwxr-x user user 4 KiB Wed Sep 30 15:30:45 2020 ../
rw-rw-r-- user user 337 B Wed Sep 30 15:31:05 2020 AUTHORS
rw-rw-r-- user user 1015 B Wed Sep 30 15:31:05 2020 CODEOWNERS
rwxrwxr-x user user 4 KiB Wed Sep 30 15:31:05 2020 community/
rw-rw-r-- user user 390 B Wed Sep 30 15:31:05 2020 CONTRIBUTING.md
rwxrwxr-x user user 4 KiB Wed Sep 30 15:31:15 2020 .git/
rwxrwxr-x user user 4 KiB Wed Sep 30 15:31:05 2020 .github/
rw-rw-r-- user user 1 KiB Wed Sep 30 15:31:05 2020 .gitignore
rw-rw-r-- user user 1 KiB Wed Sep 30 15:31:05 2020 ISSUES.md
rw-rw-r-- user user 11 KiB Wed Sep 30 15:31:05 2020 LICENSE
rwxrwxr-x user user 4 KiB Wed Sep 30 15:31:05 2020 official/
rwxrwxr-x user user 4 KiB Wed Sep 30 15:31:05 2020 orbit/
rw-rw-r-- user user 3 KiB Wed Sep 30 15:31:05 2020 README.md
rwxrwxr-x user user 4 KiB Wed Sep 30 15:31:06 2020 research/

As noted in the earlier section, for this example we will be using the latest TensorFlow container available in the Docker repository. No additional installation steps are needed, since TensorRT integration is already included in the container. The steps to pull the container and launch it are as follows:

$ docker pull tensorflow/tensorflow:latest-gpu

# Please ensure that the Nvidia Container Toolkit is installed before running the following command.
# </path/to/save/data/> is the host path that will hold the training data.
$ docker run -it --rm \
    --gpus="all" \
    --shm-size=2g --ulimit memlock=-1 --ulimit stack=67108864 \
    --workdir /workspace/ \
    -v "$(pwd):/workspace/" \
    -v "</path/to/save/data/>:/data/" \
    tensorflow/tensorflow:latest-gpu

From inside the container, we can then verify that we have access to the relevant files and the Nvidia GPU we would like to target:

# Let's first test that we can access the ResNet-50 code that we previously downloaded
$ ls -al
drwxrwxr-x 8 1000 1000 4096 Sep 30 22:31 .git
drwxrwxr-x 3 1000 1000 4096 Sep 30 22:31 .github
-rw-rw-r-- 1 1000 1000 1104 Sep 30 22:31 .gitignore
-rw-rw-r-- 1 1000 1000 337 Sep 30 22:31 AUTHORS
-rw-rw-r-- 1 1000 1000 1015 Sep 30 22:31 CODEOWNERS
-rw-rw-r-- 1 1000 1000 390 Sep 30 22:31 CONTRIBUTING.md
-rw-rw-r-- 1 1000 1000 1115 Sep 30 22:31 ISSUES.md
-rw-rw-r-- 1 1000 1000 11405 Sep 30 22:31 LICENSE
-rw-rw-r-- 1 1000 1000 3668 Sep 30 22:31 README.md
drwxrwxr-x 2 1000 1000 4096 Sep 30 22:31 community
drwxrwxr-x 12 1000 1000 4096 Sep 30 22:31 official
drwxrwxr-x 3 1000 1000 4096 Sep 30 22:31 orbit
drwxrwxr-x 23 1000 1000 4096 Sep 30 22:31 research

# Let's verify we can see our GPUs:
$ nvidia-smi

+-----------------------------------------------------------------------------+
| NVIDIA-SMI 450.XX.XX Driver Version: 450.XX.XX CUDA Version: 11.X |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla T4 On | 00000000:1A:00.0 Off | Off |
| 38% 52C P8 14W / 70W | 1MiB / 16127MiB | 0% Default |
| | | N/A |
+-------------------------------+----------------------+----------------------+

We can now start training ResNet-50. To avoid spending hours training a deep learning model, this article will use the smaller MNIST dataset. However, the workflow will not change with a larger, more realistic dataset such as ImageNet.

# Install dependencies
$ pip install tensorflow_datasets tensorflow_model_optimization

# Download MNIST data and Train
$ python -m "official.vision.image_classification.mnist_main" \
    --model_dir=./checkpoints \
    --data_dir=/data \
    --train_epochs=10 \
    --distribution_strategy=one_device \
    --num_gpus=1 \
    --download

# Let’s verify that we have the trained model saved on our machine.
$ ls -al checkpoints/

-rw-r--r-- 1 root root 87 Sep 30 22:34 checkpoint
-rw-r--r-- 1 root root 6574829 Sep 30 22:34 model.ckpt-0001.data-00000-of-00001
-rw-r--r-- 1 root root 819 Sep 30 22:34 model.ckpt-0001.index
[...]
-rw-r--r-- 1 root root 6574829 Sep 30 22:34 model.ckpt-0010.data-00000-of-00001
-rw-r--r-- 1 root root 819 Sep 30 22:34 model.ckpt-0010.index
drwxr-xr-x 4 root root 4096 Sep 30 22:34 saved_model
drwxr-xr-x 3 root root 4096 Sep 30 22:34 train
drwxr-xr-x 2 root root 4096 Sep 30 22:34 validation

Obtaining a SavedModel to be used by TF-TRT

After training, Google’s ResNet-50 code exports the model in the SavedModel format at the following path: checkpoints/saved_model/.

The following sample code can be used as a reference for exporting your own trained model as a TensorFlow SavedModel.

import numpy as np
import tensorflow as tf
from tensorflow import keras

def get_model():
    # Create a simple model.
    inputs = keras.Input(shape=(32,))
    outputs = keras.layers.Dense(1)(inputs)
    model = keras.Model(inputs, outputs)
    model.compile(optimizer="adam", loss="mean_squared_error")
    return model

model = get_model()

# Train the model.
test_input = np.random.random((128, 32))
test_target = np.random.random((128, 1))
model.fit(test_input, test_target)

# Calling `save('my_model')` creates a SavedModel folder `my_model`.
model.save("my_model")
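As a quick sanity check of the export, the SavedModel can be loaded back and queried. The sketch below assumes the my_model directory created above:

import numpy as np
from tensorflow import keras

# Reload the SavedModel folder created by model.save("my_model").
reloaded = keras.models.load_model("my_model")

# Run a quick prediction to confirm the export round-trips correctly.
sample_input = np.random.random((1, 32))
print(reloaded.predict(sample_input))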

We can verify that the SavedModel generated by Google’s ResNet-50 script is readable and correct:

$ ls -al checkpoints/saved_model

drwxr-xr-x 2 root root 4096 Sep 30 22:49 assets
-rw-r--r-- 1 root root 118217 Sep 30 22:49 saved_model.pb
drwxr-xr-x 2 root root 4096 Sep 30 22:49 variables

$ saved_model_cli show --dir checkpoints/saved_model/ --tag_set serve --signature_def serving_default

MetaGraphDef with tag-set: 'serve' contains the following SignatureDefs:

The given SavedModel SignatureDef contains the following input(s):
  inputs['input_1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 28, 28, 1)
      name: serving_default_input_1:0
The given SavedModel SignatureDef contains the following output(s):
  outputs['dense_1'] tensor_info:
      dtype: DT_FLOAT
      shape: (-1, 10)
      name: StatefulPartitionedCall:0
Method name is: tensorflow/serving/predict

Now that we have verified that our SavedModel has been properly saved, we can proceed with loading it with TF-TRT for inference.…
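As a preview of that step, here is a minimal sketch of loading the converted model and running inference on it. The output_saved_model_dir variable is carried over from the earlier code blocks, and the dummy input shape follows the (-1, 28, 28, 1) signature shown above:

import numpy as np
import tensorflow as tf

# Load the TF-TRT converted SavedModel written by converter.save().
saved_model_loaded = tf.saved_model.load(
    output_saved_model_dir, tags=["serve"])
infer = saved_model_loaded.signatures["serving_default"]

# Dummy MNIST-shaped input; replace with real data in practice.
dummy_input = tf.constant(np.zeros((1, 28, 28, 1), dtype=np.float32))
output = infer(dummy_input)
print(output)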