
3.1: What are pre-trained models?

JASON MAYES: Now let's dive deeper into pre-trained models. These models have already been trained by someone else, so you don't need to gather your own data or spend time and resources training them yourself. Instead, you can load the model and use it directly for the task it was trained for within your own production application.

Now, pre-trained models within the TensorFlow.js ecosystem come in a couple of forms. Some, like the ones the TensorFlow.js team here at Google has produced, are wrapped in easy-to-use JavaScript classes that you can use in just a few lines of code, and are available for many common use cases.

These are great for people new to machine learning and can be used in minutes, and you'll learn more about these first. Others require more knowledge of machine learning to use, as they come in their raw form with no easy-to-use helper functions wrapped around them, and you'll be learning how to use these, too.

So here, you've got an example of a pre-trained model known as BERT Q&A that can perform advanced text search in the web browser. Using this model, you can find an answer to a question within any piece of text you present to it.

Notice here how, in the demo, the question uses words that are not in the answer. If you ask it, "What are the best stargazing days?", it finds the answer referring to the nights during certain moon cycles, even though the word "days" never appears in the text.

This model can be used with any text and any question. And here, it's shown running in a Chrome extension, so you can also use it on any web page.

Now this pre-trained model is actually one of many that the TensorFlow.js team have created and made available. You may be wondering how hard it is to use something like this. Well, using that set of official TensorFlow.js models is straightforward. In fact, the core code for this one fits on a single slide, so let's walk through it.

So first, you import the TensorFlow.js library and then the pre-made model that you want to use. Next, you can define the text you wish to search. This could be just some text on a website, but here, I just use a simple string. You can then define the question the user wants to ask, which, of course, could come from some form of input box instead.

Now you load the question and answer model itself. As this takes time to load, it's performed asynchronously, so you use the then keyword to wait for it to be ready. And once the model is available, a function will be called, which is passed the loaded model as a parameter.

Finally, you can then call model.findAnswers. You pass to this function the question you want to answer, along with the text you want to search. Again, this is an asynchronous operation, as it might take a few milliseconds to execute. But once ready, this promise will resolve to return an answers object, which you can then iterate through to find the most likely answer from the given text.

In this case, it would predict cats as the answer to the question proposed, which is correct given the text you had to search on this slide.
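
To make that concrete, here is a rough sketch of what the code on that slide could look like. The passage, question, and script URLs are illustrative placeholders, not the exact ones from the slide.

// Include the libraries first, for example via script tags:
// <script src="https://cdn.jsdelivr.net/npm/@tensorflow/tfjs"></script>
// <script src="https://cdn.jsdelivr.net/npm/@tensorflow-models/qna"></script>

// The text to search. This could be text grabbed from the page, but a plain string works too.
const passage = 'Stargazing is best on clear nights around a new moon. Cats are very popular pets.';

// The question the user wants to ask, which could come from an input box instead.
const question = 'What are popular pets?';

// Loading the model takes time, so it returns a promise; use then() to wait for it.
qna.load().then(function (model) {
  // findAnswers is also asynchronous and resolves to an array of candidate answers.
  model.findAnswers(question, passage).then(function (answers) {
    // Each answer has text, a score, and start/end indices into the passage,
    // so you can iterate through them to pick the most likely one.
    answers.forEach(function (answer) {
      console.log(answer.text, answer.score);
    });
  });
});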

It's no different to writing regular web apps.

Now since launch, the TensorFlow.js team have released many easy-to-use pre-made models, and we're continually expanding our selection, which you'll hear more about shortly. Models exist across many categories, such as vision, body, text, and sound, that you can use in just a few lines of code.

You can check out tensorflow.org/js/models to see them all and to find the code snippets that show you how to use them.

Even better, you do not need a background in machine learning to use these. Just a working knowledge of JavaScript is required, but they are still very powerful. So let's take a look at some of these in action. And as I show you each one, try to think about how you could use it to solve problems that you or someone else might actually have.

First up, you have object recognition. Here you're able to run the popular COCO-SSD model live in the browser to provide bounding boxes for 80 common objects the model has been trained on. What this means is that a rectangle or square can be drawn that shows exactly where in the image each detected object is located.

Now before I continue, you may have noticed that some of the names of models are not particularly friendly sounding. This is something you'll get used to, and it should be noted that in many cases, the name often originates from some combination of the data it's trained on, the machine learning architecture it uses behind the scenes, or the utility that it provides. As you get more familiar with these things, these names become less mysterious.

COCO-SSD, for example, was trained on Microsoft's COCO dataset, which stands for Common Objects in Context. This is a famous dataset that contains hundreds of thousands of images that were annotated by humans for typical things you might see in your daily life.

Furthermore, this model uses an SSD architecture, which stands for Single Shot Detector, the scope of which is beyond this introductory course. But know that this is just describing some of the inner workings of the model itself.

And as you can see from the image on the right, this COCO-SSD model allows us to not only understand where in the image the object is located, but also how many exist, which is much more powerful than image recognition, which would only tell us that something exists somewhere in a given image. And that's the key difference between object recognition and image recognition.

So here, you can see COCO-SSD running live in a web browser on a real web page. If I click on any one of these images at the top, you can see the classification is coming back in real time. Now here's just a few examples of the objects it can recognize, and you can see how you might use it for something useful right away.

On the image on the left, you can see that this dog is very close to this bowl of treats. And you can imagine that you could detect this quite easily and send yourself an alert when this occurs.

But of course, we can do better than that. We can enable our webcam and now, live as I'm talking to you here today, if I scroll down, you can see it classifying me in real time as well. And as I move my hands around here, you can see the bounding box expand and contract, all in real time at a high frames per second. You can see here it's recognizing me as a person with about 86% confidence.

Now what's really cool about this is that all of this is running live in my web browser, on the client side in JavaScript, meaning none of these images are being sent to the server for classification. And that protects my privacy as an end user, which is really important.
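
As a rough sketch of what such a client-side setup could look like, here is one way to run COCO-SSD against a webcam feed. The 'webcam' element id is just an assumption for illustration.

// Assumes @tensorflow/tfjs and @tensorflow-models/coco-ssd are already loaded,
// and that a <video id="webcam"> element is showing the camera stream.
const video = document.getElementById('webcam');

cocoSsd.load().then(function (model) {
  function detectFrame() {
    // detect() resolves to an array of predictions for the current video frame.
    model.detect(video).then(function (predictions) {
      predictions.forEach(function (p) {
        // p.class is the object name, p.score is the confidence,
        // and p.bbox is [x, y, width, height] you could draw as a rectangle.
        console.log(p.class, p.score, p.bbox);
      });
      // Keep classifying frame after frame for a live result.
      window.requestAnimationFrame(detectFrame);
    });
  }
  detectFrame();
});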

OK, let's head on to the next model.

Now you're not just limited to using images. Here you can use our sound recognition model. You can even retrain the model to recognize custom sounds.

We've even got models for understanding language. Here, you can use our text toxicity model to automatically discover if some text is potentially insulting, threatening, or toxic. Maybe you could hide potentially offensive things as a page is rendered, for a more pleasant user experience.
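
Here is a minimal sketch of how that could look with the toxicity model. The threshold value and example sentence are illustrative.

// Assumes @tensorflow/tfjs and @tensorflow-models/toxicity are already loaded.
const threshold = 0.9; // minimum confidence before a label counts as a match

toxicity.load(threshold).then(function (model) {
  model.classify(['You are a terrible person.']).then(function (predictions) {
    predictions.forEach(function (prediction) {
      // prediction.label is a category such as 'insult' or 'threat';
      // results[0].match is true when the model is confident it applies.
      if (prediction.results[0].match) {
        console.log('Flagged as potentially', prediction.label);
      }
    });
  });
});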

Next is our Face Mesh model, which provides high resolution face tracking that's just three megabytes in size and can recognize 468 points on the human face, across multiple faces. A number of companies are using this with existing web technologies, and a great example of this is by ModiFace, part of the L'Oréal group, which combines Face Mesh with WebGL shaders for augmented reality makeup try-on.

On the image on the right, it should be noted that the lady is not wearing any lipstick. This is being augmented in real time in the browser. And then the user can select different shades at will to see what's best for them, without needing to install an app or even walk into a store.
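
As a hedged sketch, loading and running a face mesh style model could look roughly like this. It is shown with the original facemesh package API; the newer face-landmarks-detection package wraps the same model with a slightly different interface, and 'video' is an assumed webcam element.

// Assumes @tensorflow/tfjs and @tensorflow-models/facemesh are already loaded.
facemesh.load().then(function (model) {
  model.estimateFaces(video).then(function (faces) {
    faces.forEach(function (face) {
      // face.scaledMesh is an array of 468 [x, y, z] points you could render,
      // for example as a point cloud, or use to anchor AR makeup effects.
      console.log(face.scaledMesh.length, 'landmarks found');
    });
  });
});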

OK, so here you can see Face Mesh running live in the browser. On the left hand side, you can see the machine learning in action, rendering this nice mesh-like object over my face. And you can even see where it thinks my irises are. And if I just scrunch my face a little bit, you can see how well it updates. And if I squeeze my eyes, you can see that updating, all in real time, very nicely.

Now then, not only am I able to do the machine learning on the left hand side here, I can also render this 3D point cloud on the right using Three.js. And this is one of the beautiful things about JavaScript: not only am I able to do the machine learning, but there's also plenty of other very powerful libraries out there for data visualization or 3D graphics, as you see here, that you can use in a matter of hours and make something very, very quickly.

Now, the keen-eyed among you will have noticed that my performance right now is around 20 to 25 frames per second. That's because I'm running on my graphics card via WebGL here, and my graphics card is actually pretty old. If I change this to WebAssembly, you can see it's now going to execute on my CPU, and that shoots up to 30 frames per second instead. So you can change at will what hardware you want to execute on, and that's very powerful.
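
Switching backends like that is a one-liner in TensorFlow.js. A minimal sketch follows; note that the WebAssembly backend additionally requires the @tensorflow/tfjs-backend-wasm package to be loaded.

// Run on the CPU via WebAssembly instead of the GPU via WebGL.
tf.setBackend('wasm').then(function () {
  console.log('Now executing on:', tf.getBackend()); // logs 'wasm'
});

// And to switch back to the graphics card:
// tf.setBackend('webgl');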

So with that, let's head on to the next demo.

We also recently released two new pose estimation models in collaboration with research teams at Google. The first, MoveNet, is an ultra-fast and accurate model that tracks 17 key points, is optimized for diverse poses and actions, and can run at over 120 frames per second on an NVIDIA 1070 GPU, client side in the browser. The second, MediaPipe BlazePose, gives us 33 key points and is also tailored for a diverse set of poses.

This extra granularity, such as tracking both hands, could enable gesture-based applications that might be useful for certain projects. There is also now a 3D version of this model available, too.

Now both models have higher accuracy and performance over our original PoseNet implementation that some of you may have used before. So we recommend you upgrade and try them both out to see what works best for your intended use case, if you're looking to use pose estimation in a future project.
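
Both new models are exposed through the same pose-detection package, so trying one out could look roughly like this sketch. MoveNet is shown here; BlazePose is selected the same way, and 'video' is an assumed webcam element.

// Assumes @tensorflow/tfjs and @tensorflow-models/pose-detection are already loaded.
poseDetection
  .createDetector(poseDetection.SupportedModels.MoveNet)
  .then(function (detector) {
    detector.estimatePoses(video).then(function (poses) {
      // For MoveNet, each pose holds 17 keypoints, each with x, y,
      // a confidence score, and a name such as 'left_wrist'.
      console.log(poses[0] ? poses[0].keypoints : 'no person found');
    });
  });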

If you'd like to instead focus on the hands, you can do that using our hand pose tracking model. As you can see, it can track up to 21 points in three dimensions. And with some extra logic, you can use this data to detect gestures, sign language, or even control user interfaces in a touchless way, opening up a whole new world of human-computer interaction use cases.
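
A rough sketch with the handpose model follows; again, 'video' is an assumed webcam element.

// Assumes @tensorflow/tfjs and @tensorflow-models/handpose are already loaded.
handpose.load().then(function (model) {
  model.estimateHands(video).then(function (hands) {
    hands.forEach(function (hand) {
      // hand.landmarks is an array of 21 [x, y, z] points; with some extra logic
      // on top of these you could recognize gestures or drive a touchless UI.
      console.log(hand.landmarks);
    });
  });
});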

Next, you've got body segmentation. This model enables segmentation of multiple human bodies, as you can see in the image on the right. Even better, some segmentation models also bring back the pose, which you can see by the light blue lines inside the bodies. This particular model, named BodyPix, can distinguish between 24 different body parts, represented by the different colored regions in the image.
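
Here is a minimal sketch of part-level segmentation with BodyPix; the image and canvas element ids are illustrative assumptions.

// Assumes @tensorflow/tfjs and @tensorflow-models/body-pix are already loaded.
const image = document.getElementById('person'); // an <img> or <video> element
const canvas = document.getElementById('output'); // a <canvas> to draw onto

bodyPix.load().then(function (net) {
  // segmentPersonParts labels every pixel with one of 24 body parts (or background).
  net.segmentPersonParts(image).then(function (partSegmentation) {
    // Turn the per-pixel part ids into a colored mask and draw it over the image.
    const coloredMask = bodyPix.toColoredPartMask(partSegmentation);
    bodyPix.drawMask(canvas, image, coloredMask, 0.7);
  });
});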

Now the pre-made models you just saw allow you to create pretty much anything you might dream up. So let's take a look at some real examples.

Here, inSpace use real-time toxicity filters. You can see that when a user types something bad, it's flagged before it's even sent, and it alerts the user that they might want to reconsider what they're about to send, creating a more pleasant conversational experience. This is powered by our text toxicity model that was pre-trained on a dataset of over 2 million comments.

Well, how about this IncludeHealth system that uses pose estimation models to enable physiotherapy at scale? With many folk unable to leave their homes or travel these days, this technology allows for a remote diagnosis from the comfort of their own home, using off-the-shelf technology, such as a standard webcam, that many people will have access to.

Well, how about enhancing the capabilities of an e-commerce website? Here, I use a body segmentation model with some custom logic to estimate my body measurements, allowing the website to automatically select the correct sized T-shirt at checkout. Even better, this was made in just two days using our pre-made body segmentation model that you just saw on the previous slides.

And with a bit of creativity, you can take a model, add some custom code, and quite literally give yourself superpowers, like invisibility here. This is more advanced than simply replacing the background with a static image. For that, you wouldn't even need machine learning, of course.

But notice here how, when I go in the bed, the bed still deforms in the image on the right as I move around, to give you this ghostly effect, or how the laptop screen still plays.

This prototype uses the BodyPix model that you saw to calculate where the body is not, so it can eventually learn all of the background and then keep updating parts of it over time. And even better, this was made in under one day and runs entirely in the browser, meaning many people could try it out globally, even without having to install anything. You simply click a link and it just works. No images are even sent to the server for classification.

Another member of the community combined his love for WebGL shaders with a TensorFlow.js model to enable him to shoot lasers from his eyes and mouth. This actually uses the Face Mesh model you previously saw, running in real time in the browser. Now whilst this is a fun demo, you can imagine using this for a movie launch to amplify the reach with a creative experience for fans.

By combining TensorFlow.js models with other emerging web technologies, like WebRTC for real-time communication, or A-Frame for mixed reality in the browser, or even Three.js for 3D, you can now create a digital teleportation of yourself anywhere in the world in real time. Here, I can segment myself in the bedroom, transmit my segmentation to save bandwidth, and then recreate myself in the real world at the other end.

Remember, all of this is running in a web browser. No app install is required, leading to a frictionless experience for the end user. Having tried this myself, it really feels more personal than a regular video call, as you can walk up to the person and hear the audio.

Maybe next time I'm presenting to you, I'll be able to do so in your own room like this, as if I was standing right in front of you. And you saw it here first, of course. Now everything you just saw was created using a pre-made, off-the-shelf model that typically can be used in just a few lines of code.

My point for showing you all of these examples is that with a little bit of creativity, and by leveraging your existing web engineering skills, you can use many of the pre-trained models, like the ones you just saw, for pretty much any industry out there, providing your customers with new features that were previously impossible to achieve within the same time frame. So keep this in mind as you learn more in this course. Think about how you can relate what you learn so that it can be combined with your existing web engineering skills to produce something new.

And with that, it's time to try some of these out for yourself. Choose three of the pre-trained TensorFlow.js models from the ones currently shown on this slide, read the documentation, and try the live demo of each to get yourself a feel for the inputs the model expects, such as image, text, or sound, along with the outputs it provides.

Now, some parts of the documentation might seem overwhelming at this stage, but fear not. You will learn how to integrate a model into a real web application later on in the chapter, step by step. So no coding is required right now. I just want you to familiarize yourself with the models that are available.

And then, of course, you can answer the questions that follow. What inputs does the model need, and what outputs does it produce? What problems in your or someone else's life can it solve if you were to use it in a real application? And finally, did the model demo perform well for you?

Share some examples of when it did or when it did not work well, along with how you might be able to overcome those limitations. For example, maybe you find that the estimated pose points move around slightly between webcam frames. You might choose to average the found coordinates over time to reduce that jitter.

Or maybe you're using an older device and the model runs slower than expected. Remember, as you're running on your own machine, everyone will have a slightly different experience based on the hardware that you've got available to you. Maybe you can change the user experience to account for this. Or, if a demo supports it, try a different backend to execute the model on different hardware, such as the CPU or graphics card.

So head on to the next section and share your findings.

   
