Hello again, and welcome to Happy Hour 英文小酒馆. Follow our WeChat official account, 璐璐的英文小酒馆, and join our community to discover a bigger, more exciting world.
Hi, everyone. And welcome back to Geek Time. Hi, Brad.
Hey Lulu.
Brad, you are currently in Japan, right?
Correct. I just moved to Japan. I've been here a few months now.
Is that gonna be a long term thing?
Possibly. I'm still a student right now, but I am looking for full-time work. Once I figure that out, I'll make a move.
So that's just gonna be your new home. And then we're doing this recording remotely, but we're still gonna be talking about geeky or tech-related subjects or topics. And what are we going to talk about today?
I thought we'd talk about big data.
Big data.
It's one of those things that everyone has heard of, everyone talks about. But if you ask people what exactly is big data, not everyone can actually come up with the definition or not everyone knows the ins and outs of it, right?
It's a little bit difficult to get into.
So first of all, what is big data?
A lot of times when people hear the word “big,” like big pharma, things like that, they think of a big company. Big data is not that; big data is just large amounts of data.
Traditionally, data meant very small sets of information that people could put together and sort through to find out information about their customers or something like that. But now, with the explosion in our ability to store huge amounts of data and sort through larger sets, we get something called big data.
Before, for example, each store, each business, would just collect its own data, but now everything is connected.
Correct! Everything. A lot of people that have data online will trade or sell that data to other people.
But it's not just about the amount, right? It's not just the volume that is big. When you talk about big data, it isn't only a matter of quantity. Is it also about, for example, what sorts of data, the range of it?
When you look at big data, people are looking at various things, but you can kind of boil it down to three Vs: variety, volume, and velocity.
Variety is where you're getting the data from. You have a variety of different sets of data. Volume just has to do with the amount of data you're collecting. It's not just like small data sets with like a small group of people from one little area in one city. It's like people from the entire city.
And then velocity is like you're not just collecting data at one time, you're collecting data at fast rates over time.
I see. So the three Vs: variety is the diversity of the data, volume is how large the overall amount is, and velocity is the speed. All of these are basically the features of big data.
Yes.
But usually when I think of how big data is used, the first thing that jumps to my mind, as just a regular person, is me buying things online. That is my connection with big data, because my personal data is being collected. But what exactly is big data used for nowadays?
Big data can be used in a variety of different things. When we look at big data, we're using it for like predictive analytics, predicting future trends and things like that. We're looking at behavioral analysis where we're looking at how consumers are going to behave.
And we can even get into some of the things where we're talking about advertising. What should we advertise to this person? We can use that information to target a customer and say this is what you want, and maybe even use it to predict crime and things like that.
To predict crime? So it's like based on this person's behaviors in the past, he or she is likely to commit this type of crime.
Yeah, so it can be that, or it can also just be crime statistics in particular areas: when crime happens, how to staff the police. But it can get scary when it gets into that predictive side, where this person is likely to commit a crime based on their geographic location, their salary, and things like that.
Yeah. That is a scary thought, but even just from a consumer's point of view, you think of big data and you think of precision marketing, as it's called these days. They know who I am because they have collected so much data from me. So they are likely to predict what kind of products I will be buying, when I will be buying them, and how frequently I'll be buying them.
Yeah, they know when you're gonna buy something and how often you're gonna buy something. So companies like Amazon have started to not just sell products but set up schedules for when you should get your next product, and things like that.
So when you buy a product, you can start scheduling when you're going to get the next one. So you don't have to worry about ordering it yourself.
I see. And we also see big data used in media, what's that about?
When you're looking at media, it's not just advertising for products. Companies target their customers through any type of news outlet or video site. When you go on YouTube or Bilibili, they're predicting what type of content you're gonna want to watch, and so they make suggestions based on what you're looking at.
News companies get to the point where they're only going to talk about stories that are going to get your attention. They don't want you to turn off the TV, and so they're gonna pick things that make you want to watch more.
Actually, now that you've mentioned it, in Chinese we say 千人千面, literally “a thousand people, a thousand faces.” It means that if you and I, Brad, are visiting the same app or the same website, like YouTube or Bilibili, your interface, your front page, is going to be very different from mine, because obviously you have been watching different things. Your interests are in different areas, so they will push to you what they think you'll be interested in, and push to me what I will be interested in.
Whenever I buy a new computer and go to YouTube for the first time, before I log in, I look at the front page that they suggest to just the average person who hasn't logged in.
I'm like, what are all of these things on here? And then once I log in, it completely changes.
Yeah, but that is also why people are getting this kind of false impression. They feel like the whole world is interested in exactly the same thing; or they might say, recently, I like watching, let's say, videos of kittens, and then why is every other video clip about soft little kittens?
But it's actually… big data has been trying to target you right there, trying to push these to you.
Right. They're looking for things that you're going to click on. So they write articles in ways that get you to click on them, or they push content that you're going to click on, because those companies don't get paid unless the advertisers pay them, and advertisers want people who are interested in their products to see their products. So they have to build these profiles of people, and they push content that you're going to look at so that you see an advertisement.
A popular term these days is user profiling, or consumer profiling.
And largely, that is based on the use of big data because otherwise they cannot profile you.
Right. It can get even more scary, because some news organizations are not just choosing which news stories they're giving you. They're also choosing the spin of the news story. That way, when you actually see the news story, it's not just what news they're talking about, but how they spin it.
Wow.
If you look at like two news companies in the US like CNN and Fox, they'll talk about the same story, but they'll be talking about it in a completely different light or from a different angle.
I see, so different viewpoints.
One news agency is supporting it, the other is against it, but they are basing that on the big data they collected.
Yeah, what their consumers or what their people are...
And then they'll push the one that you're likely to agree with.
Exactly.
Scary thought. Obviously, we're gonna talk more about that in the advanced episode. But so far it just seems like we're stuck in business and media. It's all about pushing things to you, making you buy things. I’m sure it's used in more productive areas, like science itself.
We can use big data in science, and science is actually one of the reasons why big data started to come about. When scientists are looking for information, like when we're looking at decoding DNA, there's a lot of information there, and it isn't just one person's DNA. That can take years and years. I think the first time they decoded DNA, it took a decade.
But now they're able to decode someone's DNA in a matter of several hours, or at the longest maybe a day. Aside from that, we're looking at things like the Large Hadron Collider. They've got thousands and thousands of sensors that they're collecting data from all at once, but they're also collecting the data over long periods of time to see all the things that are happening within their experiment.
And then they have to collect all that data. And then they have to look through the data, but one person can't look through all of that. And so they have to like collect the data and tell the computer what to look for.
And then the computer can give them the information they're looking for. It can also be used when we go into like climate. We can collect a lot of climate data and we can go back and we can look at all the data we've had from the past. And we can look at it more intensely because we can put it into a larger model, not just in one area, but all over the world.
Globally, and also historically.
So, going back to these three Vs you were talking about, variety, volume, and velocity: you're able to process and analyze a really huge amount of data in order to generate perhaps more convincing or more useful results.
Right.
Okay. I think it's pretty straightforward. We're gonna wrap up the basic episode on big data here. In the advanced episode, we're gonna go more into this topic and talk about the benefits, difficulties, and the issues and concerns with big data.
Thank you, Brad, for coming to the show.
Thank you, Lulu. I’ll see you in the next episode.
We'll see you next time. Bye.