Monthly: February 2021

21 Google Cloud tools, each explained in under 2 minutes

Need a quick overview of Google Cloud core technologies? Quickly learn these 21 Google Cloud products—each explained in under two minutes.

1. BigQuery in a minute

Storing and querying massive datasets can be time-consuming and expensive without the right infrastructure. This video gives you an overview of BigQuery, Google’s fully-managed data warehouse. Watch to learn how to ingest, store, analyze, and visualize big data with ease.

2. Filestore in a minute

Filestore is a managed file storage service that provides a consistent view of your file system data and steady performance over time. In this video, we give you an overview of Filestore, showing you what it does and how you can use it for your developer projects.

3. Local SSD in a minute

Need a tool that gives you extra storage for your VM instances? This video explains what a Local SSD is and the different use cases for it. Watch to learn if this ephemeral storage option fits best with your developer projects.

4. Persistent Disk in a minute

What are persistent disks? How can they help when working with virtual machines? This video gives you a snackable synopsis of what Persistent Disk is and how you can use it as an affordable, reliable way to store and manage the data for your virtual machines.

5. Cloud Storage in a minute

Managing file storage for applications can be complex, but it doesn’t have to be. In this video, learn how Cloud Storage allows enterprises and developers alike to store and access their data seamlessly without compromising security or hindering scalability. 

6. Anthos in a minute

Modernizing your applications while keeping complexity to a minimum is no easy feat. In this video, learn why Anthos is a great platform for providing greater observability, managing configurations, and securing multi- and hybrid-cloud applications.

7. Google Kubernetes Engine in a minute

In this video, watch and learn how Google Kubernetes Engine (GKE), our managed environment for deploying, managing, and scaling containerized applications using Google infrastructure, can increase developer productivity, simplify platform operations, and provide greater observability.

8. Compute Engine in a minute

How do you migrate existing VM workloads to the cloud? In this video, get a quick overview of Compute Engine and how it can help you seamlessly migrate your workloads.

9. Cloud Run in a minute

What is Cloud Run? How does it help you build apps? In this video, get an overview of Cloud Run, a fully managed serverless platform that lets you create and deploy applications with ease. Watch to learn how you can use Cloud Run for your developer projects.

10. App Engine in a minute

Learn how this serverless application platform allows you to write your code in any supported language, run custom containers with the framework of your choice, and easily deploy and run your code in the cloud. 

11. Cloud Functions in a minute

Get a quick overview of Cloud Functions, our scalable pay-as-you-go functions as a service (FaaS) to run your code with zero server management.

12. Firestore in a minute

Cloud Firestore is a NoSQL document database that lets you easily store, sync, and query data for your mobile and web apps, at global scale. In this video, learn how to use Firestore and discover features that simplify app development without compromising security.

13. Cloud Spanner in a minute

Cloud Spanner is a fully managed relational database with unlimited scale, strong consistency, and up to 99.999% availability. In this video, you’ll learn how Cloud Spanner can help you create time-sensitive, mission critical applications at scale.

14. Cloud SQL in a minute

Cloud SQL is a fully-managed database service that helps you set up, maintain, manage, and administer your relational databases on Google Cloud. In this video, you’ll learn how Cloud SQL can help you with time-consuming tasks such as patches, updates, replicas, and backups so you can focus on designing your application.

15. Memorystore in a minute

Memorystore is a fully managed and highly available in-memory service for Google Cloud applications. This tool can automate complex tasks, while providing top-notch security by integrating IAM protocols without increasing latency. Watch to learn what Memorystore is and what it can do to help in your developer projects.

16. Bigtable in a minute

Cloud Bigtable is a fully managed, scalable NoSQL database service for large analytical and operational workloads. In this video, you’ll learn what Bigtable is and how this key-value store supports high read and write throughput, while maintaining low latency.

17. BigQuery ML in a minute 

BigQuery ML lets you create and execute machine learning models in BigQuery by using standard SQL queries. In this video, learn how you can use BigQuery ML for your machine learning projects.

18. Dataflow in a minute

Dataflow is a fully managed streaming analytics service that minimizes latency, processing time, and cost through autoscaling and batch processing. In this video, learn how it can be used to deploy batch and streaming data processing pipelines.

19. Cloud Pub/Sub in a minute

Cloud Pub/Sub is an asynchronous messaging service that decouples services that produce events from services that process events. In this video, you’ll learn how you can use it for message storage, real-time message delivery, and much more, while still providing consistent performance at scale and high availability.

20. Dataproc in a minute

Dataproc is a managed service that lets you take advantage of open source data tools like Apache Spark, Flink and Presto for batch processing, SQL, streaming, and machine learning. In this video, you’ll learn what Dataproc is and how you can use it to simplify data and analytics processing.

21. Data Fusion in a minute

Cloud Data Fusion is a fully managed, cloud-native, enterprise data integration service for quickly building and managing data pipelines. In this video, you’ll learn how Cloud Data Fusion can help you build smarter data marts, data lakes, and data warehouses.

#AndroidDevJourney spotlight – February edition

Luli Perkins, Developer Relations Program Manager

Our second edition of #AndroidDevJourney is here! At the beginning of this year we launched the #AndroidDevJourney to share the stories of members of our community through our social platforms. Each Saturday, from January through June, we’ll feature a new developer on our Twitter account.

For a chance to be featured in our March spotlight series, tweet us your story using #AndroidDevJourney.

Andrew Kelly

Tell me about your journey to becoming an Android Developer and how you got started.

In 2012 I was working as a contractor for the NSW government here in Australia as a Java J2EE web developer. I’d been in that role for 11 years, building web apps for students and teachers. However, in 2012 the government decided that contractors were expensive and let us all go. During my hand-over period I read about some kids who were writing Android apps and making lots of money doing so. The Android Market was new and so any app uploaded got a large audience, and since I already knew Java it seemed like switching from a web developer to a mobile developer might be a smart career move. So I purchased a new phone, the HTC Legend, and spent the next 2 weeks learning everything I could about Android apps. It was the first time I could run software I’d written on a device made by someone else that I could carry around with me. It was a very exciting time where any app idea seemed possible.

When my contract finally ended, I managed to get a new job working for a mobile development agency and started working on Android apps for their clients. In order to learn more about Android app development, I started to attend the local Android meetups and Google Developer Group events, listening to speakers (mostly from Google) and trying to improve my skills as an Android developer.

In 2013 I was offered the opportunity to become the organiser of the Sydney GDG and it was that year that I also attended my first Google I/O (I’ve been every year since). One of the hard parts about being a GDG organiser is finding speakers, so occasionally if there were no speakers, or if a speaker dropped out at the last minute, I would step in and give a talk instead. 2013 was also the year I decided to move on from the mobile agency I was working at, and I spent the next 5 years working as a freelance contractor, working with clients such as eBay, the Sydney Opera House, and one of the large banks in Australia. Being the organiser of GDG Sydney and a regular speaker at the meetups meant finding work was quite easy.

In 2016, because of all the speaking I was doing, I was approached to join the Google Developer Experts program; at that time I was giving regular talks at both the GDG Sydney and Android meetup events every couple of months. When I joined the GDE program, I handed over my GDG responsibilities to some friends, who still run it to this day. As part of the GDE program I’ve been lucky enough to attend many Google I/O events, and I’ve also had the opportunity to speak at conferences all over the world, including DroidCon Boston, Mobile Era in Oslo, DevFest Melbourne, DroidCon Singapore, Chicago Roboto, and many others. Having the chance to speak to so many people all over the world has been very rewarding, and I’ve made many friends.

In 2019 I joined the company where I work today – mx51, I’m the lead Android developer designing and building apps that run on payment terminals, which also integrate with Point of Sales systems. I’m still a GDE but with the 2020 madness the ability to speak at in-person events was severely hindered. I hope that in-person events will start again soon and that I can continue my journey as a GDE.

What’s one shortcut, tip, or hack you can’t live without?

Android development is constantly changing and advancing, so there is always something new to learn. My tip would be to always be learning. There are lots of ways to do this: subscribe to the Android Developers YouTube channels and Medium publications, follow Googlers and Google Developer Experts on Twitter for new tips and posts, subscribe to the Android Weekly newsletter for an overview of new libraries and blog posts, and attend your local GDG chapter and Meetups. Not only are these great ways to learn new aspects of Android development, but meetups are also a great place to meet other Android developers, share successes, and ask for advice on problems.

What’s the one piece of advice you wish someone would have given you when you started on your journey?

When I started out as an Android developer, I could never have dreamed about being a Google Developer Expert, travelling the world and speaking at large events. It took me a long time to learn that it’s ok not to know the answers to people’s questions. If at an event someone asks something you don’t know, it’s ok to say so. You can always say that you’ll find out later and get back to them. There is no need to make up a wrong answer on the spot and lead someone off course. People are often scared that a topic they’re presenting might not be the best or greatest way to do something, and they fear looking stupid. If a person in the audience suggests a better way, that shouldn’t be a worry: 1) you learnt something, 2) everyone else learnt something, and 3) there may be scenarios where your solution is better and a discussion can be had. So my advice would be: when speaking, don’t fear questions but embrace the opportunity to help someone immediately, or later, or perhaps discover something new yourself.

Amanda Hinchman-Dominguez

Tell me about your journey to becoming an Android Developer and how you got started.

I dabbled in Android development in college with the student mobile development group, but it wasn’t until I was a few years into web development that I made the real switch over. Back in my web dev days, I joined the Kotlin community, where I felt immediately welcome. Shortly after, I moved to Chicago when I heard there was a Kotlin community in the tech scene.

Getting up to speed with Android at a professional level is a whole different game, and I’ve been lucky to find the overlapping Kotlin/Android community both locally and globally. Android development has accelerated my career technically and professionally, yet the world is so deep and vast within the sandbox of Android development.

Already being an active enthusiast with Kotlin, it only felt natural to switch to Android, and I’ve never looked back. Since then, I’ve been working on scalable and complex Android applications, and contributing some technical writing along the way. I’m currently co-writing, with my colleague Pierre Laurence, “Programming Android with Kotlin: Achieving Structured Concurrency with Coroutines” for O’Reilly, and I’m excited to have it come out sometime this year.

What’s one shortcut, tip, or hack you can’t live without?

For larger projects, it’s sometimes hard to locate the file you’re looking at in your Project view. You can use the target symbol ⊕ in Android Studio to highlight the file you’re currently on.

Android Studio interface with arrows pointing to target symbol

What’s the one piece of advice you wish someone would have given you when you started on your journey?

Only install LeakCanary when, and only when, you and your team are ready for that conversation 😁

Anthony Edwards JR

Tell me about your journey to becoming an Android Developer and how you got started.

My journey as a developer started as a child. As a kid, I was obsessed with robots. I remember my dad bought me a Lego set called Lego Mindstorm, which was basically a robotics set with sensors and motors that was also programmable. After graduating high school, I enlisted in the US Army as an Aviation Maintenance Repairer. After 6 years, I was honorably discharged and then enrolled in college at Fordham University. In 2014, I received a Bachelor of Science in Computer Science. About 2 years later, I met my now wife, and together we started building EatOkra as a way for us to find Black-owned restaurants in Brooklyn, NY. As we introduced the application to new people, they shared it with their network; before we knew it, many people were asking us to cover more areas in the South.

What’s one shortcut, tip, or hack you can’t live without?

Learn how to ask the right questions.

What’s the one piece of advice you wish someone would have given you when you started on your journey?

One piece of advice I wish I had taken more seriously was to not build an application using beta technology. EatOkra’s MVP was created using a beta version of a software framework. It started out well, but as they made updates I sometimes ended up having to wait months for certain issues to get fixed. I also had to completely stop and restart the app with an entirely new code base because the company decided to change how they architected the code. I learned a lot, but it was painful to navigate.

Dinorah Tovar

Tell me about your journey to becoming an Android Developer and how you got started.

My journey started a couple of years ago (I was still in college) when I saw the Android Developer Udacity course. There was no Nanodegree back in the day. So once I saw it, I started building some apps for myself. From there, I applied for my first job as a junior developer in a big consulting firm. Then I started seeing more courses and started following a lot of people on Twitter, like Sam Edwards and Joe Birch (both GDEs). The community made me grow and learn. A couple of years later I got my first team, and I began giving talks at conferences and keeping up my Medium blog on the side. The community offers me feedback and knowledge, and especially a place to learn. My first conference was with WomenWhoCode.org here in Mexico. They opened a place for me without any experience. The same happened with Google Developers Groups here in Mexico City.

I became a Lead Engineer during my second job, and I began speaking at conferences worldwide. I asked for feedback from Sam Edwards and Carlos Muñoz (also GDEs in Colombia), and they told me not to worry because I would do amazingly, and they encouraged me to keep doing it.

I got a really nice offer to start from scratch here as a Mobile Platform engineer in Mexico City with a huge fintech Startup (Konfio.mx). This is my current job, which means I am in the architectural office where we choose new ideas and new processes and pretty much service all the areas in the company.

I started creating a group of series to teach people some specific topics that I noticed were not deeply addressed. I also started getting involved in Kotlin Multiplatform, and then two GDEs, Walmyr Carvalho and Sam Edwards, reached out and nominated me to become a GDE. They offered me feedback about my latest talks, podcast, and series, and I was accepted at the end of 2020. Right now, I’m trying to learn more and deliver more talks and blog posts to the community.

What’s one shortcut, tip, or hack you can’t live without?

My special hack as an Android Developer is to use Wireless Debugging in the latest Android Studio for physical devices. It is my favorite part because I don’t need to use any cables and the setup is super easy!

What’s the one piece of advice you wish someone would have given you when you started on your journey?

Optimizing your bucket options in Cloud Storage

You’ve got the data, and we’ve got the buckets. In this post, we’ll review why buckets are the cornerstone of everything you’ll do in Cloud Storage, and go over some of the options you’ll need to understand to get things set up. 

Why Buckets?

Anything you want to store in Cloud Storage needs to be in a bucket in order for you to do anything with it. You can use buckets to organize and control access to your data, but unlike directories and folders, you cannot nest buckets. Additionally, there are limits to bucket creation and deletion, so we recommend that you design your storage applications to favor intensive object operations and relatively few bucket operations (more on that here).
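
Because buckets can’t be nested, hierarchy is usually simulated with prefixes in object names inside a single bucket, which also keeps you on the object-operation side of that recommendation. A minimal sketch (the naming scheme and function are hypothetical, not an official convention):

```python
def object_path(team: str, dataset: str, filename: str) -> str:
    """Build an object name that simulates folder hierarchy with '/'
    prefixes inside one bucket, instead of creating a bucket per team
    or dataset. The naming scheme here is purely illustrative."""
    return f"{team}/{dataset}/{filename}"

# One bucket, many objects: lots of object operations, few bucket operations.
print(object_path("analytics", "2021-02", "events.csv"))  # analytics/2021-02/events.csv
```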

With that said, there are some initial configuration considerations to take into account when creating your bucket: Name, Location, and Storage Class. We’ll take a closer look at each of these, and even link some documentation. Let’s get to it!

What’s in a Name?

First, your bucket needs a globally-unique name. Additionally, note that this name can’t be changed* (*unless you’re looking to copy and rename, in which case, here’s documentation on “workarounds”). 

Our best advice is to choose a name that will be relevant and useful to you; something that will help you remember what’s in it, why, and what you’re doing with it. As for specific characters, limitations, and other guidelines, I’ll leave that to the documentation on bucket naming, as it’s fairly comprehensive.
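
To give you the flavor of those rules, here is a small validator covering a subset of them (3–63 characters; lowercase letters, digits, dashes, underscores, and dots; must start and end with a letter or number; no “goog” prefix; no IP-style names). It is an illustrative sketch with a function name of our own choosing, so defer to the naming documentation for the full rules:

```python
import re

def is_valid_bucket_name(name: str) -> bool:
    """Check a proposed bucket name against a subset of the published
    naming rules. An illustrative sketch, not a substitute for the docs."""
    # 3-63 characters (longer dotted names exist, but we keep it simple here).
    if not 3 <= len(name) <= 63:
        return False
    # Only lowercase letters, digits, dashes, underscores, and dots;
    # must start and end with a letter or digit.
    if not re.fullmatch(r"[a-z0-9][a-z0-9._-]*[a-z0-9]", name):
        return False
    # Names cannot begin with the "goog" prefix.
    if name.startswith("goog"):
        return False
    # Names cannot look like an IP address (e.g. 192.168.5.4).
    if re.fullmatch(r"(\d{1,3}\.){3}\d{1,3}", name):
        return False
    return True

print(is_valid_bucket_name("my-analytics-archive-2021"))  # True
print(is_valid_bucket_name("MyBucket"))                   # False: uppercase
print(is_valid_bucket_name("192.168.5.4"))                # False: IP-style
```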

Location! Location! Location!

Once that bucket has a name, you’ll need to select a geographic location. General guidance here states that you should choose a location based upon what type of redundancy options you need, where your primary users are, and what your expected time-to-first-byte is when caching is turned off. This is because a good bucket location choice will balance latency, availability, and bandwidth costs for data consumers. We’ve got plenty of options for you to choose from, so check out the documentation for a location listing.

World Buckets

Once you’ve settled on a geographic location, you’ll need to select from 3 location types—and we’ve got region, dual-region, and multi-region to give you plenty of flexibility in choosing the location that will work best for you and your users. Please note that location also can’t be changed once the bucket is created.

As for the types:

Use a region to help optimize latency and network bandwidth for data consumers, such as analytics pipelines, that are grouped in the same region.

Use a dual-region when you want similar performance advantages as regions, but also want the higher availability that comes with being geo-redundant.

Use a multi-region when you want to serve content to data consumers that are distributed across large geographic areas, or when you want the advantages of geo-redundancy.

For more specific information, including available locations and pricing, check out the documentation. 
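
The three bullets above can be collapsed into a tiny decision helper. This is a rough sketch of the guidance, not an official API, and the function and parameter names are invented for illustration:

```python
def pick_location_type(needs_geo_redundancy: bool, serves_wide_area: bool) -> str:
    """Map the location-type guidance onto the three options. A rough
    decision sketch; real choices should weigh latency, availability,
    and bandwidth cost for your actual data consumers."""
    if serves_wide_area:
        # Content consumed across large geographic areas.
        return "multi-region"
    if needs_geo_redundancy:
        # Region-like performance plus geo-redundant availability.
        return "dual-region"
    # Co-located consumers (e.g. an analytics pipeline in one region).
    return "region"

print(pick_location_type(False, False))  # region
print(pick_location_type(True, False))   # dual-region
print(pick_location_type(True, True))    # multi-region
```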

Storage Class

And finally, you should choose a storage class. This can be updated later on, and will default to Standard if you don’t initially select something more specific (although you can change the default storage class, if needed—details here). For now, let’s go over the basics.

Cloud Storage has 4 different storage classes, which all offer low latency and high durability but vary based on their availability and minimum storage duration, along with pricing for storage and access.

Storage Classes

Data that will be served at a high rate with high availability should use the Standard Storage class. This class provides the best availability with the trade-off of a higher price.

Data that will be infrequently accessed and can tolerate slightly lower availability can be stored using the Nearline Storage, Coldline Storage, or Archive Storage class. Your choices here are going to vary depending on your specific needs, and cost considerations; you wouldn’t want to pay to have something stored for all-the-time access when you only need it biannually.

I like to think about using Nearline for something I’ll need to access once a month, Archive for something I’ll need once a year, and Coldline for the stuff in between. The documentation will help you make the best choice, and provide additional pricing information.
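
That rule of thumb can be sketched as a small helper that takes expected accesses per year as input. The thresholds and the function name are illustrative assumptions, not official guidance:

```python
def suggest_storage_class(accesses_per_year: float) -> str:
    """Rough rule of thumb: Standard for frequent access, Nearline for
    roughly monthly, Archive for roughly yearly, Coldline in between.
    Thresholds are illustrative only; check the docs and pricing."""
    if accesses_per_year > 12:
        return "Standard"   # served often, needs the best availability
    if accesses_per_year >= 4:
        return "Nearline"   # about monthly access
    if accesses_per_year > 1:
        return "Coldline"   # the stuff in between
    return "Archive"        # about once a year or less

print(suggest_storage_class(52))  # Standard
print(suggest_storage_class(12))  # Nearline
print(suggest_storage_class(2))   # Coldline
print(suggest_storage_class(1))   # Archive
```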

Post-Bucket Creation Goodness

Now that you’ve got your buckets configured, we’ll be able to review the up(loads) and down(loads) of data in Cloud Storage, so stay tuned!

Learn more about your storage options in Cloud Storage Bytes, or check out the documentation for more information, including tutorials, on creating buckets.

Would you rather listen than read? Check out our new tech blog podcast.

Google Workspace Updates Weekly Recap – February 26, 2021

New updates

There are no new updates to share this week. Please see below for a recap of published announcements.

Previous announcements

The announcements below were published on the Workspace Updates blog earlier this week. Please refer to the original blog posts for complete details.
Improvements for locating new comments and important conversations in Google Docs

We’ve added two new ways that make it easier to find comments that require your attention and action in Google Docs on the web. New comment activity since the last time you viewed a document will be “badged” with a blue dot. Additionally, when you hover over the blue dot, you’ll see a “New” banner.

More options for sharing your availability in Google Calendar

We’re adding two new options in Calendar, which will help you better communicate your work availability to your colleagues. Specifically, you can create repeating out-of-office entries and split your working hours into multiple segments each day.

End a Google Meet video call for everyone at once

When a Google Workspace for Education Fundamentals or Education Plus host leaves their meeting, they can now choose to keep others on the call or to end the call instead, ejecting everyone else. | Available to Google Workspace for Education Fundamentals and Education Plus customers only.

Easily locate the source file for embedded Drive video and audio files in Google Slides

It’s now easier to find the original source file for Google Drive-stored video or audio files embedded in a Google Slides presentation.

Reminder: Ending support for IE11 for all Google Workspace apps on March 15

Last year, we announced that Google Workspace will officially stop supporting Internet Explorer 11 (IE11) on March 15, 2021. To avoid any possible disruptions in service, such as degraded performance or security vulnerabilities, please be sure to switch to a different browser before that date.

Let Google Calendar automatically book a replacement room for your events

If a room declines your event, Google Calendar can now find a similar room to replace it, automatically. | Available to Google Workspace Essentials, Enterprise Standard, Enterprise Plus, Education Fundamentals, and Education Plus, as well as G Suite Basic, Business, and Nonprofits customers only.

Admin control for AppSheet now fully available

In December 2020, we announced the availability of an admin control for AppSheet in the Additional services section of the Admin console. The rollout of this control is now complete.
For a recap of announcements in the past six months, check out What’s new in Google Workspace (recent releases).

With SRE, failing to plan is planning to fail

People sometimes think that implementing Site Reliability Engineering (or DevOps for that matter) will magically make everything better. Just sprinkle a little bit of SRE fairy dust on your organization and your services will be more reliable, more profitable, and your IT, product and engineering teams will be happy.

It’s easy to see why people think this way. Some of the world’s most reliable and scalable services run with the help of an SRE team, Google being the prime example. 

For almost two decades, I’ve lived and breathed running production systems at large scale. I had to think about tradeoffs, reliability, costs, implementing a variety of architectures with different constraints and requirements—all while getting paged in the middle of the night. More recently, I’ve had the privilege to leverage that experience and knowledge to help Google Cloud customers modernize their infrastructure and applications, including implementing an SRE practice. While these learnings look different from organization to organization, there are common lessons learned that will impact the success of your deployment.

When problems do arise, it’s usually not because of technical challenges. A stalled SRE culture is usually a business process failure—goals weren’t properly defined up front and stakeholders weren’t properly engaged. After watching this play out repeatedly, I’ve developed some advice for technology leaders about how to implement a successful SRE practice. 

Before you start

Your SRE journey should start well before you read your first manual, or put in your first call to an SRE advisor. As a technology leader within your organization, your first job is to answer a few key questions and gather some basic facts. 

What problem are you trying to solve? 

Most organizations will readily admit they’re not perfect. Perhaps you need to reduce toil, be more innovative, or release software faster. SRE, as a framework for operating large scale systems reliably, can certainly help with those goals. To do that, it’s important to understand your motivations and what gaps or needs exist in your organization.

Ask yourself what the organization is trying to achieve from the transformation. What worries the organization about reliability? For SRE to be successful and efficient, it is crucial to start with the pain. Starting by identifying what you are trying to solve will not just help you solve it; it will help your organization be more focused, align the relevant parties to a common goal, and make it easier to gain decision-makers’ buy-in (and much more).

Once you understand the problem you are trying to solve, you need to know when you have “solved” it (i.e., how you will define success). Setting goals is critical—otherwise, how will you know if you have improved? We’ll discuss how to set up metrics to help in this self-evaluation in a later post.

Who are the key decision-makers in the organization?

Even though implementing SRE principles involves engineering at its core, it’s actually more of a transformation process than a technological challenge. As such, it will likely require procedural and cultural changes.

As with any business transformation, you need to identify the relevant decision-makers up front. Who those people are depends on the organization, but the list usually includes stakeholders from product, operations, and engineering leadership, though these functions can be named differently in various organizations and can even be split across multiple organizations. Identifying those decision-makers can be especially difficult in a siloed organization. It is important to take the time to reach out to different groups to identify the key stakeholders and influencers (it will save you a lot of time later on). Make sure that you are casting a wide enough net. It is important to get input from different groups with different requirements (e.g., security).

At the same time, try to be flexible. It’s okay if your list of decision makers gets updated and fine-tuned during the process. Like in other engineering domains, the goal is to start simple and iterate. 

Get buy-in and build trust

Once you’ve identified the relevant decision makers, make sure you have support from your colleagues, and the rest of the organization’s leaders. Creating an empowered culture is critical for implementing the core principles of SRE: a learning culture that accepts failures, that facilitates blamelessness and creates psychological safety, all while prioritizing gradual changes and automation.

From my experience, you cannot drive real change in an organization without widespread support and buy-in from leadership and decision-makers—and that’s especially true for SRE.  Implementing SRE, similar to DevOps, requires collaboration between different functions in the organization (product, operations and development). In most organizations, those functions fall under separate leadership chains, each with its own processes. If you’re going to align those goals and procedures, leadership needs to prioritize the change. At the same time, driving cultural change from the bottom up can be more challenging and take longer than top-down mandates, and in some cultures will be impossible. In short, leading by example and enabling the people in the organization are critical for driving change and fostering the ‘right’ culture.

Remember: it’s a marathon, not a sprint 

The journey to SRE combines several challenges, both technical and human (culture, process, and so on), and those are intertwined. To be successful, leadership needs to prioritize organizational changes, allocating resources for engineering excellence (quality and reliability) and fostering cultural principles like reducing silos, blamelessness, and accepting failure as normal.

Align expectations! All parties involved in an SRE implementation—from product and engineering to leadership—will need to recognize that change takes time and effort, and in the short term—resources. Daunting as it may be, SRE’s goal is to solve hard problems and build for a better tomorrow. 

Interested in getting deeper with SRE principles? Check out this Coursera course for leaders, Developing a Google SRE Culture. And stay tuned for my next post, where I outline some tactical considerations for teams that are early on their SRE journey, from identifying the right teams to start with, enablement and building community.

Read More

Troubleshooting services on Google Kubernetes Engine by example

Applications fail. Containers crash. It’s a fact of life that SRE and DevOps teams know all too well. To help navigate life’s hiccups, we’ve previously shared how to debug applications running on Google Kubernetes Engine (GKE). We’ve also updated the GKE dashboard with new easier-to-use troubleshooting flows. Today, we go one step further and show you how you can use these flows to quickly find and resolve issues in your applications and infrastructure. 

In this blog, we’ll walk through deploying a sample app to your cluster and configuring an alerting policy that will notify you if there are any container restarts observed. From there, we’ll trigger the alert and explore how the new GKE dashboard makes it easy to identify the issue and determine exactly what’s going on with your workload or infrastructure that may be causing it.

Setting up

Deploy the app
This example uses a demo app that exposes two endpoints: an endpoint at /, which is just a “hello world”, and a /crashme endpoint, which uses Go’s os.Exit(1) to terminate the process. To deploy the app in your own cluster, create a container image using Cloud Build and deploy it to GKE. Then, expose the service with a load balancer. 
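The exact commands were not included in this version of the post; assuming the demo app sits in the current directory with a Dockerfile, the build-and-deploy steps might look roughly like this (the project ID, image name, deployment name, and ports are placeholders, not from the original):

```shell
# Build the container image with Cloud Build and push it to the registry
# (PROJECT_ID and the image name are placeholders).
gcloud builds submit --tag gcr.io/PROJECT_ID/crashme-demo .

# Create a deployment on the GKE cluster from that image.
kubectl create deployment crashme-demo --image=gcr.io/PROJECT_ID/crashme-demo

# Expose the deployment through an external load balancer.
kubectl expose deployment crashme-demo --type=LoadBalancer --port=80 --target-port=8080
```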

Once the service is deployed, check the running pods:
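A minimal check looks like the following; the pod name and age in the sample output are illustrative, but the column layout is what `kubectl get pods` prints:

```shell
# List the pods backing the service; the RESTARTS column is the one to watch.
kubectl get pods

# Illustrative output (pod names and ages will differ):
# NAME                            READY   STATUS    RESTARTS   AGE
# crashme-demo-7d4b9c8f6d-x2x9p   1/1     Running   0          2m
```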

Notice that RESTARTS is initially at zero for each pod. Use a browser or a command line tool like curl to access the /crashme endpoint. At this point, you should see a restart:

Each request to that endpoint will result in a restart. However, be careful not to do this more often than every 30 seconds or so; otherwise, the containers will go into CrashLoopBackOff, and it will take time for the service to be available again. You can use this simple shell script to trigger restarts as needed:
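The script itself did not survive in this version of the post; a minimal sketch consistent with the description (a 45-second pause between requests, and an `$IP_ADDRESS` variable holding the load balancer address) might be:

```shell
#!/bin/bash
# Hit the /crashme endpoint in a loop, pausing long enough between
# requests to avoid sending the containers into CrashLoopBackOff.
while true; do
  curl "http://$IP_ADDRESS/crashme"
  sleep 45
done
```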

where $IP_ADDRESS is the IP address of the load balancer you’ve already created. 

Why do container restarts matter? Well, restarts, to a certain degree, are an expected part of a container’s typical lifecycle in Kubernetes. Too many container restarts, however, could affect the availability of your service, especially when multiplied across a larger number of replicas for a given Pod. Not only do excessive restarts degrade the service in question, but they also risk affecting other services downstream that use it as a dependency.

In real life, the culprit for a large number of restarts could be a poorly designed liveness probe, issues like deadlocks in the application itself, or misconfigured memory requests that result in OOMKilled errors. So, it is important for you to proactively alert on container restarts to preempt potential degradation that can cascade across multiple services.
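As an illustration of the first culprit, a liveness probe with too tight a timeout can restart a container that is healthy but slow. The fragment below is a generic example of such a misconfiguration, not taken from the demo app:

```yaml
# A too-aggressive liveness probe: a 1-second timeout with no failure
# tolerance restarts the container on any slow health-check response.
livenessProbe:
  httpGet:
    path: /healthz
    port: 8080
  periodSeconds: 10
  timeoutSeconds: 1     # too tight for a slow endpoint
  failureThreshold: 1   # a single slow response triggers a restart
```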

Configure the alert
Now, you’re ready to configure the alert that will notify you when restarts are detected. Here’s how to set up your alerting policy:

[Image: Configuring the alert]

You can use the kubernetes.io/container/restart_count metric, filtered to the specific container name (as specified in the deployment yaml file). Configure the alert to trigger if any timeseries exceeds zero—meaning if any container restarts are observed.
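If you prefer the command line to the console, a policy along these lines should work; the display name, file name, and JSON layout here are illustrative sketches of the Cloud Monitoring alerting-policy format, not taken from the original post:

```shell
# Write a hypothetical alerting-policy definition, then create it with gcloud.
cat > restart-policy.json <<'EOF'
{
  "displayName": "Container restarts observed",
  "combiner": "OR",
  "conditions": [{
    "displayName": "restart_count > 0",
    "conditionThreshold": {
      "filter": "metric.type=\"kubernetes.io/container/restart_count\" AND resource.type=\"k8s_container\"",
      "comparison": "COMPARISON_GT",
      "thresholdValue": 0,
      "duration": "0s"
    }
  }]
}
EOF
gcloud alpha monitoring policies create --policy-from-file=restart-policy.json
```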

With the setup done, you are ready to test and see what happens!

Testing the alert

When you’re ready, start the looped script to hit the /crashme endpoint every 45 seconds. The restart_count metric is sampled every 60 seconds, so it shouldn’t take very long for an alert to show up on the dashboard:

[Image: Testing the alert]

You can mouse-over the incident to get more information about it:

[Image: Incident details]

Then click on “View Incident”. This takes you to the Incident details screen, where you can see the specific resources that triggered it—in this case, the incident is generated by the container.

[Image: View Incident]

Next, you can click on View Logs to see the logs (in the new Logs Viewer!)—it’s immediately apparent that the alert was triggered by the containers restarting:

[Image: View Logs]

This is all very nicely tied together and makes troubleshooting during an incident much easier!

In summary…

The latest GKE dashboard includes many improvements over previous iterations. The new alerts timeline is intuitive, and incidents are clearly marked so that you can interact with them to get the full details of exactly what happened, all the way down to the container logs that tell you the actual problem.

As an oncall SRE or DevOps engineer for a service running on GKE, the GKE dashboard makes it easier for you to respond to incidents. You’re now able to go from an incident all the way to debug logs quickly and easily and reduce the time it takes to triage and mitigate incidents. For a short overview on how to troubleshoot services on GKE, check out this video:

In previous episodes, we’ve shown you how to set up monitoring and alerting for your GKE services, but what do you do when an alert fires? In this episode of Stack Doctor, we show you how to use the alerts timeline on your GKE monitoring dashboards to troubleshoot your services. Watch to learn how you can easily spot and resolve issues in your applications and infrastructure!

A special thanks to Anthony Bushong, Specialist Customer Engineer, for his contributions to this blog post.

Related Article

Shrinking the time to mitigate production incidents—CRE life lessons

See how you can use SRE and CRE principles and tests from Google, including Wheel of Misfortune and DiRT, to reduce the time needed to mi…

Read Article

Read More

Cash App uses Google Cloud to power mobile payments innovation and research

Mobile payments are creating opportunities to reach and benefit more people worldwide by providing services to underbanked communities, and empowering streamlined services in e-commerce and brick-and-mortar stores.

Square, a U.S.-based financial services company that specializes in payment software and hardware products, currently stands at the cutting-edge of this industry.

The company regularly pursues more powerful, advanced financial services solutions, like its burgeoning consumer finance service Cash App. Cash App has been a particularly active source of innovation. Last year, Square acquired Dessa, an artificial intelligence (AI) research firm, to bolster Cash App’s existing solutions and drive innovative new mechanisms to improve customer experience and increase accessibility to banking services.

Cash App opted to use Google Cloud AI and machine learning (ML) solutions and NVIDIA’s graphics processing units (GPUs) to handle the immense compute demands of its applied AI efforts.

Establishing a foundation for breakthroughs in Artificial Intelligence

Dessa has a long history of applying AI to what it calls “Bananas” – novel and ambitious projects that use emerging machine learning technologies to solve problems in new ways, ultimately driving real-world impact. 

Dessa has used Google Cloud’s AI Platform services, which were configured and made available by Square’s Platform Infrastructure Engineering group for Square’s internal needs. The services enable data scientists at Square to carry out these data-heavy, processing-intensive initiatives. Dessa works with Deep Neural Networks (DNNs), which come with long training times and data volume requirements that can make new experimentation and ideation challenging. DNNs are resource intensive, but they help solve many of the training problems that AI/ML practitioners sometimes face.

While Cloud Storage helped to alleviate some of the challenges associated with storage of raw and analytical data, the speed with which information could run through and between GPUs was also a sticking point. 

“Google Cloud gave us critical control over our processes,” said Kyle De Freitas, a senior software engineer at Dessa. “We recognized that Compute Engine A2 VMs, powered by the NVIDIA A100 Tensor Core GPUs, could dramatically reduce processing times and allow us to experiment much faster. Running NVIDIA A100 GPUs on Google Cloud’s AI Platform gives us the foundation we need to continue innovating and turning ideas into impactful realities for our customers.”

NVIDIA stepped in to identify bottlenecks in these processes and implement the A100 so Dessa could experiment with large datasets and push out new models more quickly. The NVIDIA A100 GPU delivers 20X more compute capacity than the previous generation, along with new TF32 precision, a Multi-Instance GPU (MIG) feature, and support for accelerating structural sparsity.

Google Cloud AI and NVIDIA delivered a roughly 66% reduction in the processing time it takes to complete a core ML processing workflow.

NVIDIA has also provided Dessa with developer support to improve ML engineer skills, remove bottlenecks, and overcome challenges in real time. NVIDIA developer support, GPUs, and AI Platform on Google Cloud have also improved the speed and quality of Cash App services to customers.

For example, Dessa would generally need about six hours to process two terabytes of data and complete training for a single epoch (one full pass over the dataset by the ML algorithm). Now, it can process seven terabytes of data in under two hours. Considering that Dessa runs between 10 and 20 epochs at a time, some of which involve training with 350 million parameters, this roughly 10X acceleration delivered by the NVIDIA A100 has proven invaluable.

“NVIDIA GPUs and AI Platform have given us value by scaling up to deal with data and the volume associated with it, while Dataflow gives us the speed to capitalize on event data in real-time,” said De Freitas.

Further embedding AI into Cash App

Because Cash App has put so much effort into maturing its AI/ML capabilities, it is now better positioned to effect real change in the communities it serves.

“We’re focused on providing financial support across communities, like the ability to share resources in a secure, inclusive, and traceable manner through advanced ML technologies,” said De Freitas.

Through Dessa’s experimentations and innovations, Cash App and Square are furthering efforts to create more personalized services and smart tools that allow the general population to make better financial decisions through AI.

Related Article

Your ML workloads cheaper and faster with the latest GPUs

NVIDIA T4s, P100s, V100s can reduce costs and increase throughput compared to K80s.

Read Article

Read More

Increasing limits for three key Cloud Monitoring features

Cloud Monitoring is one of the easiest ways you can gain visibility into the performance, availability, and health of your applications and infrastructure. Today, we’re excited to announce the lifting of three limits within Cloud Monitoring.

First, the maximum number of projects that you can view together is now 375 (up from 100). Customers with 375 or fewer projects can view all their metrics at once, by putting all their projects within a single workspace.

[Image: Cloud Monitoring]

Second, the maximum number of custom metric descriptors within custom.googleapis.com is now 2,000 (up from 500). Customers approaching 500 metrics will no longer have to juggle old metrics in order to start sending new metrics.

Third, all metrics sent by Compute Engine Agents are now retained for 24 months (up from 6 weeks) for free. Customers will now be able to perform long-term trend analysis, such as tracking memory growth year-on-year for capacity planning purposes.

We look forward to lifting even more limits throughout 2021–stay tuned!

Related Article

Three ways tight integration makes logging and monitoring easier

How is GCP better than Azure with regard to ease of use? A major differentiator from a recent blog was how Logging and Monitoring “jus…

Read Article

Read More

Providing better product information for shoppers

To best help users find your content and products in Search, we recommend that you clearly identify products mentioned.


Read More

Black History Month: Celebrating the success of Black founders with Google Cloud: Switchboard Live

February is Black History Month—a time for us to come together to celebrate and remember the important people and history of African heritage. Throughout this month we have highlighted three Black-led startups and how they use Google Cloud to grow their businesses. Our final feature highlights Switchboard Live and its founder, Rudy. Specifically, Rudy talks about how the team was able to innovate quickly with easy-to-use Google Cloud tools and services.

[Image: Rudy headshot]

Nearly our whole lives have moved to virtual spaces as a result of the pandemic. The way we interact with events, our loved ones, and the world changed seemingly overnight, and many of us are still adjusting to these new times.

My startup, Switchboard Live, brings people together virtually in meaningful ways. The pandemic provided us an opportunity to help our customers build meaningful connections between people at a time when we are required to stay socially distant. 

What is Switchboard?

Switchboard is a SaaS platform that makes it easy for you to share live streams, such as webinars, conferences, events, and panels, to multiple social channels and platforms, increasing your viewership. Features like quick setup, an embeddable player, and StreamShare work together to build larger audiences.

For example, with our StreamShare feature, not only do you get to share your livestream, but your audience can also share it with others—in short, it expands your reach considerably. Once you are done, you can have the stream stored on your website for others to watch at a later date, and you can also embed your live stream on your site for simple, easy access. Together, these features allow our customers to reach more viewers, approximately doubling their viewership and growing audiences no matter where they originated.

[Image: Switchboard Live logo]

Making it faster and easier to connect 

Although many livestream options have popped up in response to COVID-19, I founded Switchboard Live in 2016 with a goal of making it faster and easier for content creators and brands to reach a bigger audience. Before creating my company, I worked for 15 years in the streaming industry. I always noticed how difficult it was to manually stream events across different platforms and workflows with little to no interruption. The valuable experience I gained from manually working through the technical aspects and processes of streaming early in my career illuminated the highly scalable potential of a technology like Switchboard Live.

While Switchboard Live initially focused on providing houses of worship a way to connect remote congregations, we quickly realized there was a massive need for this technology across industries. We expanded into sports, education, government, nonprofits, media and entertainment, technology, and digital creators.

Switchboard Live – Moving from Azure to Google Cloud 

Switchboard started by using Google Cloud for small workloads, with Azure as our primary provider. In particular, we used Google App Engine to spin up additional resources when website traffic increased, to ensure our customers have a seamless experience with our platform. However, as we grew, we wanted a fully managed Kubernetes platform that was easy to manage and had an effortless developer experience. 

It’s not surprising that Google Kubernetes Engine (GKE) was the best fit given these requirements, particularly since it is the most mature container orchestration service available today, providing industry-first capabilities such as release channels, multi-cluster support, unique 4-way autoscaling, and node auto repair to help us improve availability.

Likewise, Cloud Build offered a unique developer experience when it came to CI/CD. Since it was fully serverless, there was no need to pre-provision servers or pay in advance for additional capacity. Most importantly, we only have to pay for what we use. And finally, given the deep integration of Cloud Operations with GKE, actionable insights are just a click away. When we detect an anomaly in a dashboard or when an alert is fired, there is no need to log into another portal or console to start troubleshooting and remediating. Everything is available in context, with deep links to logging and monitoring.

With the help of the Google for Startups Black Founders Fund, we will soon be expanding to more Cloud tools such as Cloud DNS and Cloud SQL. These will help reduce our latency to almost zero.

Google for Startups: Black Founders Fund – Streaming before it was cool

Google for Startups funding and continued mentorship have been instrumental, as our signups were 4x what we had seen in previous months. And with the additional funding we were able to hire three additional full-time employees to keep up with the influx of customers we acquired during the pandemic. Receiving credits for Google Ads and Google Cloud has been helpful, but having someone there to talk us through the best way to use our credits was the real benefit. Furthermore, leveraging Google employees to receive additional guidance in areas of marketing and sales empowered us to look beyond our initial goals and reach even higher to meet our customer needs.

We were even able to receive mentorship from our fellow Black-led startups in our cohort, on matters such as accounting, legal agreements, and partnerships.

2019 Google For Startups Cohort: Black Founders Event

Connecting with a community of Black founders influenced me dramatically. Being around other Black founders who looked and talked like me was extremely uplifting. In most cohorts for startup mentorship, I am rarely in the company of other Black-led startups, if any at all— a challenge that we discuss at length amongst our Black Founders Fund cohort. Google has paved the way for many Black founders, including myself, to share their innovative ideas and game-changing technology with a wider audience, ultimately empowering us to more readily serve our customers and communities. 

Thank you to all the Google mentors we have had throughout our beautiful journey. You have made it possible for us to reach so many people. What they are streaming is their joy, and we at Switchboard Live want to help spread that joy.

If you want to learn more about how Google Cloud can help your startup, visit our startup page here and sign up for our monthly startup newsletter to get a peek at our community activities, digital events, special offers, and more.

Read More