
2.3: How to train Machine Learning systems?

JASON MAYES: So the next question you might be wondering is, how on Earth do you train such systems? And that's a really great question. Now, I know you all work in many different industries, so feel free to adapt the following example. But let's do a thought experiment and pretend we are trying to make a web-connected system for farmers who are trying to classify apples and oranges to speed up the delivery of the picked fruits that are currently mixed together and need to be sent to the right destinations.

Now, the first thing you need to do is identify the features or attributes of the fruits. Let's take color and weight as an example, just like before. Both are easy to measure and can be represented numerically. You can use digital weighing scales and color values from a webcam that would allow you to do this.

Now, going back to our high school maths, if you were to sample some apples and oranges and plot these values on a scatter chart as shown, you can see here that the red and green apples fall into the red and green spectrums on the x-axis of this graph and tend to cluster together with a similar weight variance in the y-axis. The oranges, as they're super juicy, tend to be heavier.

Now, if you can draw a line that separates the oranges from the apples, you can now, with some degree of certainty, decide what fruit something is simply by plotting its features. If it's above the line, it's most likely an orange. If it's below, it's probably an apple. You have essentially learned how to classify the fruits.

So if you can get a piece of software to define the equation of this line by itself, you can get a computer to then learn to classify fruits too. And this is the essence of what machine learning is. Essentially, you're just trying to figure out the best possible way to separate out the example data, such that for a new, unseen example, you have the best chance of classifying it correctly.
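That step, where software finds the equation of the separating line by itself, can be sketched with a tiny perceptron. This is only a toy illustration in Python (the course itself moves on to TensorFlow.js later); the feature values, labels, and learning rate below are all invented.

```python
# Toy sketch: a perceptron "learns" a line separating two fruit classes.
# Feature values (color value, weight) are invented for illustration.

def train_perceptron(samples, epochs=50, lr=0.1):
    """samples: list of ((feature1, feature2), label) with label +1 or -1."""
    w1 = w2 = b = 0.0
    for _ in range(epochs):
        for (x1, x2), label in samples:
            # Which side of the current line does the point fall on?
            pred = 1 if (w1 * x1 + w2 * x2 + b) > 0 else -1
            if pred != label:  # nudge the line toward the mistake
                w1 += lr * label * x1
                w2 += lr * label * x2
                b += lr * label
    return w1, w2, b

def classify(model, point):
    w1, w2, b = model
    x1, x2 = point
    return "orange" if (w1 * x1 + w2 * x2 + b) > 0 else "apple"

# (color value, weight) pairs: oranges (+1) are heavier than apples (-1).
data = [((0.1, 1.2), -1), ((0.2, 1.0), -1), ((0.3, 1.3), -1),
        ((0.8, 2.0), +1), ((0.9, 2.2), +1), ((0.7, 2.1), +1)]
model = train_perceptron(data)
print(classify(model, (0.85, 2.1)))  # a heavy, orange-colored fruit -> "orange"
```

On linearly separable data like this, the perceptron is guaranteed to converge to some separating line, though not necessarily the one a human would draw.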

But what if you had chosen bad features? Let's take ripeness and number of seeds. Here, the plot is less useful to us. There is no straight or even curved line that would allow us to separate these data points. You can't really learn from this data alone.
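One way to make "you can't learn from this" concrete is to ask: what is the best accuracy any single cut-point on a feature could achieve? The sketch below, with invented seed counts, shows that when both classes share the same spread, no threshold beats chance.

```python
# Sketch of why a bad feature fails: if apples and oranges have the same
# spread of "number of seeds", no threshold on that feature separates them.
# Seed counts below are invented for illustration.

def best_threshold_accuracy(values_a, values_b):
    """Best accuracy achievable by any single cut-point on one feature."""
    total = len(values_a) + len(values_b)
    best = 0
    for t in sorted(set(values_a + values_b)):
        # Rule: "class A if value <= t"; also allow the reversed rule.
        correct = sum(v <= t for v in values_a) + sum(v > t for v in values_b)
        best = max(best, correct, total - correct)
    return best / total

apple_seeds  = [4, 5, 6, 7, 8]
orange_seeds = [4, 5, 6, 7, 8]   # identical spread: the feature carries no signal
print(best_threshold_accuracy(apple_seeds, orange_seeds))  # 0.5 -- chance level
```

Run the same function on a good feature, such as well-separated weights, and the best threshold reaches 100% accuracy; the feature, not the algorithm, is what made learning possible.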

And you might be thinking, but, Jason, why would you choose such obviously bad features? And sure, with this trivial example, it's quite clear that this would be unwise. But what about those medical scans we spoke about earlier in the course? How do you define features for that? And what if you had more than two features?

Previously, you had two features, so you used a two-dimensional chart to separate the data. If you had three features, you would need a 3D chart. Here, a weight dimension is added to our previous chart. And now, you can use a plane, or a rectangle in 3D space, if you will, to separate the oranges from the apples. Hopefully, in this image, you can see that the oranges are now further back in the weight axis, making them separable from the apples.

But it turns out that even three dimensions is typically not enough for most machine learning problems. It's not unusual to have tens, hundreds, thousands, or, in the case of images, even millions of features. As humans, you'll struggle to visualize anything higher than three dimensions. However, for a computer, the mathematics works out just the same, and it is perfectly capable of doing so. Instead of using a plane, you actually use something called a hyperplane, which simply means one dimension less than the number of dimensions that you have, allowing us to split the data just like you did here, but by using more features and attributes, which can sometimes lead to better results.
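The hyperplane idea is less mysterious in code than it sounds: deciding which side of a hyperplane a point lies on is a single dot product, whatever the number of dimensions. The weights, bias, and feature values below are invented for illustration.

```python
# In any number of dimensions, "which side of the hyperplane" is just the
# sign of w . x + b. Weights and points here are invented for illustration.

def side_of_hyperplane(w, b, x):
    """Returns +1 or -1 depending on which side of the hyperplane x lies."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score > 0 else -1

# A hyperplane in 4-D feature space (say: color, weight, diameter, firmness).
w = [0.5, 1.0, -0.25, 0.1]
b = -2.0
print(side_of_hyperplane(w, b, [0.9, 2.5, 1.0, 0.2]))  # +1: the "orange" side
print(side_of_hyperplane(w, b, [0.2, 1.0, 1.2, 0.5]))  # -1: the "apple" side
```

The same function works unchanged for 2 features (a line), 3 (a plane), or a million (a hyperplane); only the length of the lists changes.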

Now, when you're working with supervised learning, there are typically two types of problems. First is a classification problem: what thing is represented by the inputs provided? Is it a cat or a dog, for example? You're trying to predict a specific class from a number of possible classes. Second is a regression problem. This is where you're trying to predict some number. For example, what is the price of a house with a given set of attributes? You can see visually here how, in these trivial examples, a single straight line is used to solve both types of problem. But in the first instance, it separates out the data, and in the second, the line itself is used to predict the output value.
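The two uses of the line can be shown with one invented equation, y = m·x + c: classification asks which side of the line a point falls on, while regression reads a predicted value off the line itself. Slope, intercept, and sample points below are all made up.

```python
# One straight line, two jobs. M and C are invented for illustration.
M, C = 0.8, 1.0  # slope and intercept of the line

def classify_point(x, y):
    """Classification: which side of the line does the point fall on?"""
    return "orange" if y > M * x + C else "apple"

def predict_value(x):
    """Regression: read the predicted output straight off the line."""
    return M * x + C

print(classify_point(2.0, 3.5))  # above the line -> "orange"
print(predict_value(5.0))        # the line's y at x = 5
```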

So let's say you've got a dog, and you've got a mop. It should be pretty easy to find the differences between these two images, right? It turns out, in the real world, there are edge cases that can overlap. Even I as a human need to double-check some of these. And the reason I bring this up is to raise awareness about bias in your training data.

One of the biggest challenges you will face is collecting enough training data that's truly representative of all the different situations you might encounter in the real world, like the edge cases you see here. And if you don't do this, then there's a good chance that the machine learning system will fail in these cases too.

Imagine you wanted to recognize cats. You might need 10,000 images of different breeds of cats at different stages of their life, kittens versus adults, different fur patterns, different colors, all of which are taken at different angles and lighting conditions, even indoors versus outdoors, in order to be able to train a system that understands the essence of what cat pixels really are.

Now, the other thing I'd like to point out here is that data is not always in the form of imagery. I used that in my slides as it's easy to view, but remember, input data to train an ML system could be tabular data, text, sensor recordings, sound samples, or pretty much anything you want to classify, so long as it can be represented numerically to be used as input.

So now you have a high-level appreciation for what's involved. Let's focus some more on gathering data and trying it out for ourselves. For this, you'll learn how to use a website known as Teachable Machine. This site was made by some friends at Google and is powered by a machine learning library for JavaScript, which is known as TensorFlow.js, which we'll learn more about later on in the course. Now, this site is great for prototyping, and in our case, to illustrate the importance of good quality input data.

So go ahead, pause the video, and open teachablemachine.withgoogle.com in a new window, and place that window side by side so you can follow along as I show you how to use it. I'll wait for you to do that. And remember to hit Play when you're ready to continue.

All right, so to kick things off, let's hear from the creators of Teachable Machine to give us the overview before we use it together.

SPEAKER 1: People are training computers and creating machine learning models to explore all kinds of new ideas--

SPEAKER 1: --to understand the world, to play--

But machine learning is pretty intimidating to get into, so we've been wondering, what if it wasn't? With Teachable Machine, we set out to make it easier for anyone to create machine learning models without needing to write any machine learning code. When it first launched in 2017, it allowed everyone to get a feeling for what machine learning is. But now, Teachable Machine puts the power of machine learning in your hands, allowing you to save your models and use them in your own projects.

SPEAKER 3: So let's say you want to build a model to recognize you and your dog. You just open up the site, record samples of you and samples of your dog, click Train, and you instantly have your own machine learning model, which you can use in your sites, apps, and more. You can upload your model to host it online or download it to work entirely on device.

SPEAKER 1: With Teachable Machine, you can create custom models for all sorts of things-- or even poses, personalized machine learning models for the things that matter to you. And folks have already been trying it out, using Teachable Machine in their own experiments, solving problems in their communities--

SPEAKER 1: --or even just at home. Start creating and see where your ideas take you.

JASON MAYES: So now you know what this system can do. Follow along with me to make your very first machine learning model to recognize any object in your room.

OK, so when you head over to Teachable Machine, you should see a website that looks something like this. As you can see, it supports many different types of projects. First is image recognition, the second is audio recognition, and the third is pose recognition.

All of these are models that are powered by TensorFlow.js, which you'll learn more about shortly. Now, today, we're going to focus on image recognition, just to teach you the basics of gathering data. So click on that, and then click Standard image model, as shown. You should then see a screen that looks something like this.

On the left-hand side, we've got the objects you want to detect. At the moment, they're called Class 1, Class 2, and so on. Let's give them more meaningful names. So the first thing, I'm going to call "Jason," and the second thing, I'm going to call "Bottle," as shown.

OK, so we can now click on Webcam, enable access to our webcam if it's our first time using this site, and if you allow access, you should then see your webcam come into view. Now, if you've got multiple webcams, you can click on the dropdown to select the right one. And now, we're going to hold to record some images.

Now, this one is Jason, so that's me. And, of course, you're going to record your own things over there, so feel free to change your names as needed. And now I'm going to try and get some sample images of myself. Note, I move my head around to get some variation as well. I've got about 43 images, and that's great.

We now go to Bottle, click on Webcam, and here, I'm going to try and recognize this bottle instead. So I'm going to try and get the same number of images. And that's important, because if you have too many of one class, statistically speaking, the machine learning will think that that class is more likely to appear just naturally. So it might learn just to predict that class every time. And, of course, when it's training, it would be right 90% of the time if that class made up 90% of the data. So it's important to have roughly the same number of examples for your training data.
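The 90% figure is exactly the trap of imbalanced data: a model that always predicts the majority class scores high accuracy while learning nothing. A minimal sketch, with an invented 90/10 split reusing the demo's class names:

```python
# Why balanced classes matter: with a 90/10 split, always predicting the
# majority class is "right" 90% of the time while learning nothing.
# The split and class names below are invented for illustration.
from collections import Counter

labels = ["jason"] * 90 + ["bottle"] * 10  # imbalanced training set

majority_class, majority_count = Counter(labels).most_common(1)[0]
always_majority_accuracy = majority_count / len(labels)
print(majority_class, always_majority_accuracy)  # jason 0.9
```

This is why raw accuracy is a misleading score on imbalanced data, and why the video insists on gathering roughly the same number of examples per class.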

So let's try and do the same thing here with the bottle nice and close to the webcam and get roughly the same number, 40-something images. Now, let's go ahead and train our model. And what's happening now is that, live in the browser, TensorFlow.js is going to retrain the model so it can distinguish between these two different object types. And you can see, it's already finished. And you now have a preview of it running live on the right-hand side.

Now, you can see right now that it says "Jason" with 100% confidence, which is correct. And if I bring the bottle into view, it switches to "Bottle" with 100% confidence, which is also correct. And this is very usable for a prototype, for example.

Now, if this is good enough for your needs, you can actually click on Export Model at the top here, click on Download, and then click on this button here to download the model files. This will contain things like a model.json and some binary files you can host on your website to then load and use. And, of course, we've given you some example code down below here that you can use to get started. But don't worry about that for now. We'll be teaching you all about how to code later on. So for now, let's go back and focus on training data.

Now, I just trained a system that recognizes the difference between me and a bottle. But let's try other bottles to see if it manages to recognize those too. It seems to be recognizing that shape is important here. And let's just try one with a different color as well. And again, it gets that that is actually a similar thing. However, if I bring into view this kind of jar-like object, it's not quite as sure, but it gets to the 90s. So we might want to train a system to be better at recognizing jars, for example.

So in order to do that, we can add another class on the left here, maybe call it "Jar," like so. And now we can add training data for that. So let's go ahead and get roughly 45 images of this jar. Let's open that up like so and get roughly the same number as before. Hit Train Model to retrain once again, this time with three classes. And now you can see we've got Jason, Bottle, or Jar showing. And if I show it the jar, it's now 100% confident that that is a jar, which is correct.

And let's go ahead and test the bottles again. It's always good to retest things. The bottle comes in at 100% confidence there. And let's bring that green one back into view again. It actually thinks this is a jar. Well, looking at these two objects, you can see that they're both green, or have a lot of green contained within them, right?

So what's happened is that it thinks that a jar is just some green object, whereas a bottle is just some other kind of shape. So in order to get around this, you're going to have to add more training data to your examples to improve the accuracy, so it knows that this is, in fact, a jar versus a bottle.

So what are we going to do here? Well, for this bottle over here, we're going to go back to the Bottle training data, click on Webcam, and we're going to add some examples of this bottle to the scene so that we know that this is also a bottle. So let's go ahead, put that in view, and click Hold to Record.

Now, because I've got 84 images of Bottle, I need to go and get some more training data for the other classes too. Oh, let's move the bottle out of view, because that would be contaminating my data. I'm going to go back to Jason and get some more of me. So now I'm going to click Hold to Record to get more images of myself.

And now let's do the same thing for Jar as well. So open up the webcam for Jar and put this back into view. And let's get some variation of the jar and get roughly-- OK, that should be close enough. So now, if I click on Train Model here, we're going to have a lot more training data. It might take a bit more time, but we should now get a more reliable output model that can distinguish these things.

Bottle number 1, close enough to 100%. Bottle number 2, straight in there, 100%. Bottle number 3, which was the one that was previously confused with the jar, is now correct as well. And now, if we try the jar, hopefully it still remembers that this thing over here is a jar. And as you can see, we've now trained a much more robust model. So go ahead and explore for yourself with objects in your room to see what you can recognize and what causes problems.

As you experienced, the machine learning model you created was only as good as the data that you presented to it. The more data you use, the better it can be at generalizing, just like the image shown here. All of these objects are bottles, but if you only use one type in training, the model could struggle to detect the others. Instead of learning that color was the distinguishing feature, it might learn that shape was more important, leading to better predictions. As you saw, in this use case, where images were the input, you do not have control over what features it decides are important, as it will decide by itself, based on the data presented, what distinguishing features provide the best separation. And by providing varied inputs, it has less chance of choosing the wrong features.

In all the images I just showed you, I used my hand to hold the object. In this case, it could learn that the hand is the thing to recognize and not the object itself. Technically speaking, it would be correct, as my hand existed in every single image that we showed it. But as a human, you know that this was not the intent.

Continue to explore using Teachable Machine to see what edge cases you can find that lead to issues for the objects you're trying to recognize. Maybe try for other objects too if they're similar.
