Google+ Archives | Datafloq https://datafloq.com/tag/google/

Analyzing Search Results In The Era Of Big Data Analytics: An Evaluation https://datafloq.com/read/analyzing-search-results-era-big-data-analytics-evaluation/ Wed, 11 Aug 2021 14:39:22 +0000

Google It!

This phrase, or term, whatever you prefer to call it, has changed our world. In fact, the internet, search engines and social media have become such an inextricable part of our lives that we can hardly imagine that, just a few decades ago, they did not exist.

With technology, software and digital ecosystems becoming a part of our lives, information became abundant. This information, which can be collected, processed, analyzed and stored, is referred to as 'data'.

Big Data is a phenomenon that involves the storage, processing and analysis of huge volumes of information in a structured fashion. Big Data Analytics is the field of expertise that uses this information to find trends, make projections and issue forecasts.

The advent of Big Data Analytics in the field of optimizing search results has not been discussed much. In this article, we are going to help SEOs, agencies and brands by highlighting how understanding Big Data Analytics can help in improving search result performance.

Big Data Analytics and Google: What we need to know

At its heart, the California-based organization is the biggest data company in the world!

With relevancy and correct information at the heart of Google's new search policy, Big Data is rising in prominence. Google does not want to show its users results that are merely similar to what they asked for; it wants to hit the nail squarely on the head.

Google realizes that it can do so, by taking help from Big Data Analytics, Artificial Intelligence, Machine Learning and other advanced technologies.

You might turn around and say, ok, but how is all this related to Big Data anyway?

Search engines like Google read and process staggering volumes of data every day. They take into account almost everything that users do when they are online:

  • The kind of content and websites they engage with
  • The amount of time they spend on those sites
  • How often and how far they scroll through a web page
  • Signals cross-referenced with social media platforms
  • The technical optimization of web pages and websites

All of this helps Google build a data bank on which it runs complex AI and machine learning models. This helps Google give users the best possible search results for their queries.

Content and Big Data Analytics: The Relationship

At the end of the day, data is information, which by itself is content.

This means that Google treats the best content as the most valuable form of data. If you are new to the field, think of content as anything that has been published online.

As millions of pages of content are created every day, Google breaks it down into data for better understanding and processing. Experts and insiders who discuss how Google works have pointed to something called a 'semantic' understanding of content.

Semantic search refers to a process whose aim is no longer just to match and understand keywords, but also to attach truth, factual accuracy and relevance to them.

Let us try to explain this with an example.

Imagine you are a technology brand that manufactures televisions. A standard SEO strategy for such a brand would include targeting keywords relevant to the industry and building links on them to climb the SERPs. While this would help the brand climb the rankings, there is a lot more involved. Google will also try to find mentions of the brand name all over the web.

If it comes across a lot of negative content about the brand, in the form of criticisms of poor products and services, its Big Data Analytics will factor in these negative signals. Google will then conclude that since this brand attracts a lot of negative comments across the web, it may not be the right brand to show to audiences. In such cases, it will drop the brand from the results for those keywords. Hence, it is important for the brand to address and take down negative search engine results.

Can Search Engine Optimization take help from Big Data Analytics?

In the last two sections, we discussed how search engines use Big Data Analytics to optimize results.

In this section, we will show how SEOs, brands and agencies can do the same to tailor-make content that search engines love. This will help them rank their branded content on the SERPs.

As a digital marketing strategy, SEO is constantly evolving. Its methods, best practices and strategies need to be changed according to how search engines are behaving.

This is why marketers, SEOs, brands and agencies need to understand what search engines are moving towards. In this regard, Big Data Analytics can help create the best possible content that not only ranks, but builds credibility, pushes sales and creates brand awareness.

Businesses need to take help from tools that can highlight important data sources. This includes everything from keyword research tools to looking at social media analytics. Understanding data and how it is processed can help create the best content that will be picked up by search engines.

Tools and Software used in SEO Strategies that can help in Big Data Analytics

1. Keyword Research Tools

The first and probably the most important tools available to SEOs are keyword research tools. To name just a few, marketers should start using quality tools like Ahrefs, Moz and SEMrush to help them find the best keywords for their brands.

These tools base their suggestions on the number of searches people are performing and list the authority pages that already rank for the keyword.

They can also estimate how many links you would need to outrank a competitor, along with offering many other functions and features that help in optimizing for search results.

2. Rankings and Analytics Tools

According to leading marketers, Google itself gives SEOs the best rankings and analytics tool: Google Search Console. It is a superior ranking tool that shows you exactly where you stand.

It can show you which pages on your site are ranking and the impressions they are generating, along with highlighting issues and problems. In a nutshell, GSC shows you everything Google takes into account when it comes to ranking.

Following this data gives you a clear path to the next stage of your strategy. GSC reports already cover the pages crawled, sitemap analysis and more.
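
To make this concrete, below is a minimal, hypothetical Python sketch of pulling query-level Search Console data with the google-api-python-client library. The property URL, credential file and date range are placeholders, and the snippet is an illustration of the idea rather than anything from the original article.

```python
# Hypothetical sketch: pull query-level performance data from Google Search
# Console using google-api-python-client. Property URL, key file and dates
# below are placeholders.
from google.oauth2 import service_account
from googleapiclient.discovery import build

SCOPES = ["https://www.googleapis.com/auth/webmasters.readonly"]
creds = service_account.Credentials.from_service_account_file(
    "service-account.json", scopes=SCOPES)  # hypothetical key file

service = build("searchconsole", "v1", credentials=creds)

# Top queries by clicks for a 28-day window.
response = service.searchanalytics().query(
    siteUrl="https://www.example.com/",      # placeholder property
    body={
        "startDate": "2021-07-01",
        "endDate": "2021-07-28",
        "dimensions": ["query"],
        "rowLimit": 25,
    },
).execute()

for row in response.get("rows", []):
    print(row["keys"][0], row["clicks"], row["impressions"], row["position"])
```

Rows like these (queries, clicks, impressions, average position) are exactly the kind of raw material a Big Data Analytics workflow can aggregate over time.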

3. Content and Research Tools

Last but definitely not the least are content tools. These tools are targeted towards giving you insights on two fronts: the real human user and the search engine.

A piece of content needs to be as attractive to human readers as it is to the algorithms of search engines. While tools like Ahrefs can always help you with content strategy, you need to start focusing on platforms like BuzzSumo as well.

From doing the best topic research to cross-linking to authoritative sources like Quora, Reddit and Medium, content tools are very important.

Benefits of using Big Data Analytics in SEO Strategies

In this section, we are going to highlight three major ways in which using Big Data Analytics can help improve SEO KPIs:

i. Building Trust and Credibility

Every brand wants to build trust within its target audiences. If you as a brand are not building trust, no one will feel encouraged to start buying from you. Using data analytics can help you assess areas where you need to build trust.

Social media advertising and Google AdWords are already using cookies and retargeting to help marketers. With the reliance on digital platforms growing, brands will be able to harvest more customer data that can help them with better targeting.

ii. Better User Experiences

Site owners need to create the best user experiences to help their SEO performance. Here again, data analytics can help play a major role. It can inform website owners about what is good and what areas they need to improve on.

From location-based targeting to catering to customized tastes and preferences, everything is built on the customer's previous search behaviour. If you analyze eCommerce businesses and platforms, you will see that this is the way forward.

iii. Deeper Insights and SEO Analytics

Generating better insights and awareness about what works and what doesn't is a critical component of SEO. You need to use big data to grow your learning and help you make better decisions.

There are so many diverse SEO strategies that can be explored if you have the right data in front of you. For example, one analysis of long-tail keywords showed that users who searched with them had a higher purchase intent. There are hundreds of other insights on offer.

The Final Word

Both search engines and SEOs are relying on big data to show them the way forward. For search engines, it is all about showing information that is as accurate and factual as possible. For SEOs, it is about doing what audiences and search engines love best.

In both cases, it is Big Data Analytics that will help show the way. When it comes to making informed decisions like pursuing strategies or making investments, data is at the core of what we have to do. Putting investments behind strategies that data shows do not work is foolish.

In the next few years, Google will pursue its drive towards search equity. This, again, will become a reality only on the basis of Big Data Analytics, and the field of search results will change once more.

There is a reason why Clive Humby referred to data as 'the new oil'. Do you agree with his statement? Let us know in the comments section below.

7 Ways Automation and Machine Learning Will Change Google Ads https://datafloq.com/read/7-ways-automation-machine-learning-will-change-google-ads/ Thu, 11 Feb 2021 20:06:41 +0000

In 2017, Google's CEO, Sundar Pichai, announced their shift toward an AI-first future. For digital marketers, ML and AI technologies can be a game-changer.

With artificial intelligence, advertising platforms can collect user data and help companies connect with target audiences more effectively. They minimize menial tasks and help you create data-driven and relevant ads.

Here is how to use automation and machine learning to boost your Google Ads campaigns.

The Introduction of Responsive Search Ads

Today's customers are tech-savvy and more demanding. With the rise of wearables and mobile advertising, online users expect to get things done faster. Above all, they want brands to provide highly personalized, relevant, and helpful ads.

That is where responsive search ads can help. They allow marketers to create ads that show more text to customers. Responsive search ads combine your creativity with ML to provide exceptional results.

With responsive search ads, you can create up to 15 headlines and four description lines. Google will do the rest. The platform uses ML to test content combinations.
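
The scale of that testing job is easy to underestimate. The purely illustrative Python sketch below counts the combinations Google's systems could assemble from a fully loaded responsive search ad, assuming the commonly cited serving format of up to three headlines and two descriptions per impression; all numbers are for illustration only.

```python
# Illustrative only: how many headline/description combinations a responsive
# search ad with the maximum assets could theoretically be assembled into.
from math import comb, perm

NUM_HEADLINES = 15      # maximum headlines you can supply
NUM_DESCRIPTIONS = 4    # maximum description lines you can supply

# Assumption: up to 3 headlines (order matters in the rendered ad) and up to
# 2 descriptions are shown per impression.
headline_arrangements = perm(NUM_HEADLINES, 3)   # 15 * 14 * 13
description_pairs = comb(NUM_DESCRIPTIONS, 2)

print(headline_arrangements)                      # 2730
print(headline_arrangements * description_pairs)  # 16380
```

Testing thousands of candidate layouts per ad is exactly the kind of job machine learning is better suited to than a human marketer.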

Google Ads strives to identify which ad combinations perform best for various search queries. So, even if users are googling the same thing, they may see different ad content based on context.

That way, you ensure your ads match your target audience's search terms and intent. That helps you boost campaign performance and maximize conversions.

Google's internal research results back me up on that. Advertisers using Google's machine learning to create personalized ad campaigns generate up to 15% more clicks.

Boosting Foot Traffic with Google's Local Campaigns

Online customers still make most of their purchases in brick-and-mortar stores.

Surveys confirm this. With the evolution of intelligent voice search, mobile searches for "near me" have grown. Moreover, 80% of customers using local keywords go to a local store after performing online research.

Therefore, Google Ads is an opportunity to drive foot traffic to your physical store. That is where its Local Campaigns can help.

When creating a Local Campaign, you only need to define store locations you want to promote. You do that by linking your Google My Business listing.

Google uses ML technologies to optimize bids, ad placements, and asset combinations. That way, it helps you promote your brick-and-mortar stores online more effectively.

Leveraging Automated Ad Extensions

When Google's AI algorithms estimate that an extension can boost your ad performance, they create it automatically.

There are seven types of automated ad extensions, including:

  • Call extensions
  • Message extensions
  • Sitelinks
  • Structured snippets
  • Location extensions
  • Seller ratings
  • Callout extensions

However, with automated extensions, you will have less control over your ads. Google can automatically replace your copy with its AI-generated solution. Always consult your AdWords agency before implementing them.

Increasing the Optimization Score

The Optimization Score is one of the most crucial Google Ads metrics. It measures your Google Ads account's performance. It ranges from 0% to 100%.

Using ML and AI algorithms, Google Ads assesses your campaigns to check whether they are fully optimized. It measures the score in real time. The system considers multiple factors, including your settings, your account and campaign status, and your recent recommendations history.

Along with the performance score, you also see a list of recommendations. They help you optimize your campaigns. Additionally, you will learn how applying those suggestions will impact your optimization score.

The optimization score impacts several aspects of your Ads campaign, including:

  • Bids and budgets
  • Keywords and targeting
  • Ad copies
  • Ad extensions
  • Possible recommendations

Harnessing the Power of Automated Bidding

You do not have to worry about manual bidding anymore. Machine learning automates the process. By streamlining your bidding efforts, you can eliminate guesswork and make informed decisions.

Now, there are several automated bidding types to apply, including:

  • Maximize Clicks helps you boost your CTR.
  • Target Impression Share ensures your ads appear at the top of Google's SERPs.
  • Target CPA sets search bids to drive conversions at your target cost per action.
  • Target ROAS maximizes your return on ad spend.
  • Maximize Conversions helps you get as many conversions as possible within your budget.

Smart bidding uses machine learning to help you align bids with your advertising objectives. ML technologies analyze signals, such as language, operating system, or time of day, to understand the context of each search.

Setting Up Smart Campaigns

AI-driven Google Smart campaigns are perfect for optimizing Google Ads campaigns in the long run. They benefit your online presence on multiple levels.

First, they provide automated maintenance. Set your ad campaigns, and Google will do the rest. It takes care of keyword selection, bidding, and targeting. That way, it helps you focus on more demanding aspects of your Google Ads campaigns.

Second, Google will take care of copywriting. You only need to provide it with relevant product information.

Personalizing CX with Smart Shopping Campaigns

Big data and digital advertising are a match made in heaven.

Google's Shopping campaigns prove that. Shopping ads include rich product information, such as images, price, merchant name, and reviews.

The role of machine learning in this process is immense. ML technologies estimate the likelihood that a click will result in a conversion. They adjust bids accordingly.

AI and ML also determine where your Shopping ads will show up and which products they will feature.

AI and ML are the Future of Google Ads

In the world of PPC advertising, delivering relevant and personalized ads is crucial. To harness the full potential of Google Ads, you need to have a lot of technical experience and knowledge.

Artificial intelligence and automation change that. These technologies reduce manual work and help you stay agile in the hypercompetitive digital advertising ecosystem.

Do you use ML and automation to boost your Google Ads campaigns? Please, share your experiences with us!

Why the arrival of Google Stadia may lead to big things for augmented reality https://datafloq.com/read/why-arrival-google-stadia-may-lead-big-things-augmented-reality/ Thu, 27 Aug 2020 14:31:38 +0000

In March 2019, Google made an announcement that had huge ramifications for the wide world of gaming. Stadia is on its way and with it comes a brave new era of video game streaming.

The arrival of Google Stadia has caused a stir in the gaming community, but its rather significant implications for Augmented Reality haven't commanded much media coverage over the past few months. So now seems like an ideal time to take a deeper look into Stadia itself, and what it means for the fledgling field of AR.

Introducing Stadia

Even without delving into its wide-reaching implications, Google Stadia is a revolutionary concept that challenges the very need for the console-based gaming that's commonplace today.

Stadia is entirely cloud-based, meaning that Google's servers handle all the processing power that users need before beaming a fully fledged, high-quality video game directly to their television.

With Stadia, the concept of downloading content will be a thing of the past. There are no installation processes to worry about either. Games are simply streamed directly onto any compatible device – from TVs to PCs, to laptops, to smartphones.

Subscriptions to Google's Stadia Pro service start at around £8.99 per month, and the resolution quality of the games you play is entirely dependent on the quality of your internet connection, with 4K gaming achievable for users with download speeds of 35 Mbps or better.

(An example of just how scalable Google Stadia's service is for varying users. Image: GamesRadar)

The bigger picture

The significance of Stadia is that it's a project aiming to disrupt the computing hardware industry. Cloud computing has a level of potential that's limited only by consumers' respective internet connections. Using only cloud-based technology, Google believes that it can enable users to stream games with more than twice the processing power of the PlayStation 4.

Google isn't renowned for its hardware packages, and the bigger picture here is the company's bid to bypass the need for powerful physical devices wherever possible. The idea of streaming demanding games through a tiny Chromecast-sized stick may sound the alarm for other, more hardware-oriented businesses.

Writing for Medium, Colby Gallagher explains: "Apple or Samsung saying they've increased speeds by 20% on a new release of phone isn't going to mean anything next to someone that can run a full 3D CAD package on their handset. All streamed from the cloud."

Augmented implications

The announcement of Stadia will have huge ramifications for Augmented Reality, among other industries heavily involved in the world of gaming and technology as a whole.

AR is still a developing industry in its own right. We've seen widespread success with mobile apps such as Pokemon Go!, but this brand of tech hasn't quite made the transition into everyday usage.

However, the cloud-based power of Stadia may well be set to give Augmented Reality hardware the boost it needs to emerge into mainstream usage.

The problem with much of modern AR is that it's still heavily dependent on hardware to provide users with complete immersion into an augmented world. Microsoft's HoloLens and Magic Leap both require all of the necessary computing power to be stored within the physical wearable device.

While both HoloLens and Magic Leap are designed ergonomically, the headsets are bulky and clumsy at times. Magic Leap, in particular, requires users to attach a computing 'puck' to their belt in order for the device to work efficiently.

The arrival of Google Stadia may solve the AR industry's problem of bulky, unappealing devices. Given that Stadia can handle huge levels of computing power via the cloud, hardware like AR glasses will be able to tap into levels of processing power that are unfathomable by today's standards. The limitless possibilities of Stadia could ultimately give users the power of a gaming PC within their AR glasses while on public transport.

Such technology will enable Augmented Reality to really think big, and expand way beyond the limitations of today. Remote AR companies like Watty are already developing games and social applications that focus on bringing people together from around the world, seemingly in the same room, through advanced augmented techniques. Now imagine the level of interaction that can be achieved with the power of the cloud firmly behind their technology.

Key Takeaways from Google’s Meena Chatbot https://datafloq.com/read/key-takeaways-googles-meena-chatbot/ Wed, 26 Feb 2020 10:17:37 +0000


This article has been penned by Swapan Rajdev, Co-Founder & CTO of Haptik

Chatbots and virtual assistants have been in the news in a big way for the past few days. The reason? One word: Meena.

In a research paper titled "Towards a Human-like Open-Domain Chatbot", Google presented Meena, a conversational agent that can chat about "anything".

According to the Google researchers who worked on the project, Meena stands in contrast to the majority of current chatbots, which tend to be highly specialized within a particular domain and perform well as long as users do not stray too much from expected usage. Open-domain chatbots, on the other hand, theoretically have the ability to converse with a user about anything they want.

In practice, however, most current open-domain chatbots often simply do not make sense. At worst, they say things that are inconsistent with what has been said previously, or which indicate a lack of common sense or basic knowledge about the world. At best, they say something like "I don't know": a sensible response to almost any query, but one which does not address the specific needs of the user.

This is the gap that Google aims to address with Meena, which they claim has come closer to simulating the experience of a conversation with an actual human being than any other state-of-the-art chatbot to date.

Meena is undoubtedly a game-changer in the Conversational AI space. But what lessons does it hold for enterprises that have implemented, or plan to implement, conversational solutions? Can brands look forward to the day when they have their own Meenas chatting away to customers and achieving hitherto unknown milestones in customer engagement? Read on as we unpack Meena and attempt to answer those questions.

The conversational experience users always wanted (but never got)

When it comes to chatbots, the expectation has always been that one can interact with them much like one does with a fellow human: asking them anything under the sun and receiving an engaging response that makes perfect sense. But reality has thus far fallen short of those expectations.

Meena certainly brings us a lot closer to bridging that gap between expectation and reality. It has been described as a multi-turn, open-domain chatbot, and both terms are key to understanding its capabilities. Open-domain essentially means that there are no restrictions on the topics that can be discussed with the chatbot. Multi-turn means that the chatbot is capable of engaging in a conversation that involves back-and-forth between participants. Both of these form the backbone of Meena's ability to effectively simulate human conversation.

(Image: As an open-domain, multi-turn chatbot, Meena is capable of effectively simulating human conversation. Source: Google Blog)

To mimic human conversation, it isn't enough for a chatbot to say something that makes sense. It's equally important to make sense in context. As we discussed at the start, a chatbot saying "I don't know" in response to a user's query technically makes sense, but it does not address the specific query the user had. This response could indicate one of two possibilities: a) that the chatbot did not understand the user's query, or b) that the chatbot understood the query but genuinely did not know the answer. It is important to differentiate between the two to gauge how good the chatbot really is at understanding.

This becomes very important when it comes to the chatbot's ability to effectively simulate human conversation. If the chatbot responds "I don't know" to certain queries that it simply does not know the answer to, it is akin to a person genuinely not knowing the answer to certain questions. But if a chatbot repeatedly replies "I don't know" to even basic queries that it would reasonably be expected to know the answers to, then that would shatter the illusion of human-like conversation, ultimately having an adverse impact on end-user experience.

Google has, in fact, introduced a new metric to demonstrate how good Meena is at simulating human conversation: the Sensibleness and Specificity Average (SSA). To determine Meena's SSA, evaluators tested Meena, as well as a few other open-domain chatbots (Mitsuku, Xiaoice, Cleverbot, DialoGPT), assessing every response on the basis of two questions: "Does it make sense?" and "Is it specific?"

Meena achieved an SSA score of 79%, leaps and bounds ahead of the other chatbots tested (its closest competitor, five-time Loebner Prize winner Mitsuku, scored 56%). To put this into perspective, the SSA score of an average person was found to be 86%, a mere 7 percentage points above Meena's score. This certainly places Meena's ability to 'speak like a human' into stark relief!

(Chart: Meena's high SSA (Sensibleness and Specificity Average) score reflects its superior ability to engage in human-like conversation. Source: Google Blog)
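
The metric itself is simple; the hard part is the human labelling. The following hypothetical Python sketch shows how an SSA-style score could be computed from per-response sensibleness and specificity judgments. It is illustrative only and is not Google's evaluation code.

```python
# Illustrative sketch of an SSA-style score: each chatbot response gets two
# human labels (sensible? specific?), and the score is the mean of the two
# per-label averages. Not Google's evaluation code.
from dataclasses import dataclass
from statistics import mean
from typing import List

@dataclass
class Judgment:
    sensible: bool   # "Does it make sense?"
    specific: bool   # "Is it specific?"

def ssa(judgments: List[Judgment]) -> float:
    sensibleness = mean(j.sensible for j in judgments)
    specificity = mean(j.specific for j in judgments)
    return (sensibleness + specificity) / 2

# Toy example with made-up labels.
labels = [Judgment(True, True), Judgment(True, False), Judgment(False, False)]
print(f"SSA = {ssa(labels):.0%}")   # SSA = 50%
```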

Meena has undoubtedly come closer than any of its peers to fulfilling the long-standing expectation of being able to talk to a chatbot about 'anything'. Let us now examine another crucial aspect of Meena that should be significant to anyone observing the Conversational AI space.

Conversational data is the key

An open-domain chatbot's ability to engage in free-flowing conversation with a user depends to a large extent on the dataset used to train it. The larger and more varied the dataset the AI is trained on, the greater the scope of the queries it is able to address. And Meena has certainly outclassed its competition in terms of the sheer quantum of data used to train it.

Meena has 2.6 billion parameters and was trained on 341 GB of textual data (comprising 40 billion words) derived from public-domain social media conversations. Apart from the staggering amount of data involved, the key point to note here is that the data Meena was trained on was conversational data, i.e. conversations and messages written by real human beings.

In an enterprise context, as we've discovered at Haptik over the years, conversational data sourced from interactions between real customers and support agents or sales assistants is particularly crucial. The best way to simulate an engaging and personalized customer support or sales experience is to train your Conversational AI on data from real customer interactions. Meena's vastly superior capabilities certainly highlight the necessity of this approach to training chatbots.

What does Meena mean for enterprise virtual assistants?

All the buzz created by Meena over the past few weeks has no doubt generated a lot of interest across industries about the enterprise applications of this highly sophisticated and interactive chatbot model.


There's certainly a lot about Meena that would be immensely beneficial to a brand looking for innovative ways to engage customers. The ability to engage in free-flowing conversation brings a naturalistic, 'human' touch to interactions between customers and virtual assistants, significantly enhancing customer experience. Haptik's study on Virtual Assistant Personality Preference last year demonstrated the concrete impact that chatbot personality can have on customer experience (with 67% of respondents expressing a preference for assistants with a more friendly personality). A chatbot with Meena's capabilities would certainly up the ante when it comes to exhibiting distinct virtual personalities.

Humanizing computer interactions, improving foreign language practice, and making relatable interactive movie and videogame characters are some of the possible applications of the Meena chatbot model, according to Google. It wouldn't be a stretch to add more engaging, interactive and human-like virtual sales clerks and customer support agents to that list!

So, how soon can enterprises get their own Meenas?

Well, the realistic answer is it will take a little time.

While Meena is definitely a giant leap in the right direction for Conversational AI, implementing a virtual assistant solution of equal sophistication is a somewhat challenging prospect for most enterprises at present.

To begin with, a Meena-like enterprise assistant would need to be trained on vast amounts of domain-specific conversational data. Acquiring data is not particularly difficult, but cleaning it up to make it usable by Machine Learning (ML) models requires a fair amount of time and effort. Haptik is fortunate in this regard, as we have over six years' worth of conversational data and have already invested a substantial amount of effort in preparing it for our ML models.

However, access to vast amounts of conversational data is a relatively simpler barrier to overcome than the hardware barrier. The processing power required to train a chatbot of Meena's sophistication is tremendous, to put it mildly. It took Google's researchers 2048 TPU cores to train Meena! Google has not officially released any figures, but estimates online suggest that the cost would easily have run to over a million dollars. That being said, with ML models becoming more efficient and the required hardware becoming more cost-effective, this barrier is becoming less insurmountable.

Looking ahead

Meena has definitely shown us all the road ahead, broadly speaking, for Conversational AI. From the emphasis on large datasets for training (sourced from real human conversations) to simulating more free-flowing, interactive and human-like conversations, there's a lot that chatbot developers, as well as enterprises implementing conversational solutions, can learn from Meena.

We at Haptik are excited by the possibilities that Meena has opened up in this space. Developments like these only encourage us to redouble our own efforts towards designing superior conversational experiences, and we look forward to showcasing some of our research soon.

Are you interested in developing an Intelligent Virtual Assistant solution for your brand?

Get in Touch



Originally published here

ScyllaDB Trends: How Users Deploy The Real-Time Big Data Database https://datafloq.com/read/scylladb-trends-users-deploy-real-time-big-data/ Wed, 27 Nov 2019 11:45:30 +0000

ScyllaDB is an open-source distributed NoSQL data store, reimplemented from the popular Apache Cassandra database. Released just four years ago in 2015, Scylla has averaged over 220% year-over-year growth in popularity according to DB-Engines. We've heard a lot about this rising database from the DBA community and our users, and decided to become a sponsor of this year's Scylla Summit to learn more about deployment trends from its users. In this ScyllaDB Trends Report, we break down ScyllaDB cloud vs. on-premise deployments, the most popular cloud providers, the SQL and NoSQL databases used alongside ScyllaDB, the most time-consuming management tasks, and why you might choose ScyllaDB over Cassandra.

ScyllaDB vs. Cassandra: Which Is Better?

Wondering which wide-column store to use for your deployments? While Cassandra is still the most popular, ScyllaDB is gaining fast as the 7th most popular wide column store according to DB-Engines. So what are some of the reasons why users would pick ScyllaDB vs. Cassandra?

ScyllaDB offers significantly lower latency, which allows you to process a high volume of data with minimal delay. In fact, according to ScyllaDB's performance benchmark report, its 99.9th percentile latency is up to 11X better than Cassandra's on AWS EC2 bare metal. So this type of performance has to come at a cost, right? Quite the opposite: ScyllaDB claims in the same report a 2.5X cost reduction compared to running Cassandra, as it can achieve this performance with only 10% of the nodes.
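
One practical reason these comparisons matter is that ScyllaDB keeps Cassandra's CQL interface, so existing Cassandra client code can usually be pointed at a Scylla cluster unchanged. The sketch below is our own illustration of that: it uses the standard Python cassandra-driver and assumes a Scylla node listening on localhost, with placeholder keyspace and table names.

```python
# Minimal sketch: the standard Cassandra Python driver talking to ScyllaDB,
# since Scylla is CQL/Cassandra compatible. Host, keyspace and table names
# are placeholders for illustration.
from uuid import uuid4
from cassandra.cluster import Cluster

cluster = Cluster(["127.0.0.1"])   # hypothetical local Scylla node
session = cluster.connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS demo
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}
""")
session.set_keyspace("demo")
session.execute("""
    CREATE TABLE IF NOT EXISTS events (
        id uuid PRIMARY KEY,
        payload text
    )
""")

session.execute("INSERT INTO events (id, payload) VALUES (%s, %s)",
                (uuid4(), "hello scylla"))
for row in session.execute("SELECT id, payload FROM events LIMIT 5"):
    print(row.id, row.payload)

cluster.shutdown()
```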

There are dozens of quality articles on ScyllaDB vs. Cassandra, so we'll stop short here so we can get to the real purpose of this article, breaking down the ScyllaDB user data.

ScyllaDB Cloud vs. ScyllaDB On-Premises

ScyllaDB can be run both in the public cloud and on-premises. In fact, ScyllaDB is most commonly deployed in both public cloud and on-premise environments within a single organization. The 44% of ScyllaDB deployments leveraging both cloud and on-premise computing could be doing so either through a hybrid cloud environment serving a specific application, or by using these environments separately to manage different applications.

ScyllaDB on-premise deployments and ScyllaDB cloud deployments were dead even at 28% each. You can run both the free open-source ScyllaDB and ScyllaDB Enterprise in the cloud or on-premise, and the ScyllaDB Enterprise license starts at $28.8k/year for a total of 48 cores.

(Chart: ScyllaDB Cloud vs. ScyllaDB On-Premise - Database Trends Report, ScaleGrid)

Most Popular Cloud Providers for ScyllaDB

With 28% of ScyllaDB clusters being deployed exclusively in the cloud, and 72% using the cloud in some capacity, we were interested to see which cloud providers are most popular for ScyllaDB workloads.

#1. AWS

We found that 39.1% of all ScyllaDB cloud deployments among our survey participants are running on AWS. While we expected AWS to be the #1 cloud provider for ScyllaDB, the percentage was considerably lower than across all cloud database types in this survey, where 55% reported deploying on AWS. That number is more in line with our recent 2019 Open Source Database Trends Report, where 56.9% of cloud deployments were reported to be running on AWS. This may be because AWS does not support ScyllaDB through its Relational Database Service (RDS), so we could hypothesize that as more organizations continue to migrate their data to ScyllaDB, AWS may see its share of these deployments decline.

#2. Google Cloud

Google Cloud Platform (GCP) was the second most popular cloud provider for ScyllaDB, coming in at 30.4% of all cloud deployments. Google Cloud does offer its own wide-column store and big data database, Bigtable, which is actually ranked #111 on DB-Engines, one place below ScyllaDB at #110. ScyllaDB's low cost and high-performance capabilities make it an attractive option for GCP users, especially since it is open source, whereas Bigtable is only commercially available on GCP.

#3. Azure

Azure followed in third place representing 17.4% of all ScyllaDB deployments in the cloud from our survey respondents. Azure is an attractive cloud provider for organizations leveraging the Microsoft suite of services.

(Chart: Most Popular Cloud Providers for ScyllaDB: AWS, GCP, Azure - Database Trends Report, ScaleGrid)

The remaining 13.0% of ScyllaDB cloud deployments were found to be running on DigitalOcean, Alibaba, and Tencent cloud computing services.

Their managed service, Scylla Cloud, is currently only available on AWS, and you must use the ScyllaDB Enterprise version to leverage their DBaaS. Scylla Cloud plans to add support for GCP and Azure in the future, but with only 39% reporting on AWS, we can assume over 60% of ScyllaDB deployments are being self-managed in the cloud.

Databases Most Commonly Used with ScyllaDB

As we also found in the 2019 Open Source Database Report, organizations leverage 3.1 different database types on average. In this survey, however, organizations using ScyllaDB reported using only 2.3 different database types on average, a 26% reduction compared to our results across all open-source database users. We also found that 39% of ScyllaDB deployments use only ScyllaDB and do not leverage any other database type in their applications.

So which databases are most commonly used in conjunction with ScyllaDB? We found that ScyllaDB users also use the SQL databases MySQL and PostgreSQL, each appearing in 20% of deployments. The second most commonly used database alongside ScyllaDB was Cassandra, represented in 16% of deployments; we could assume this reflects organizations testing ScyllaDB as an alternative to Cassandra in their applications, as both database types are wide-column stores.

MongoDB was the fourth most popularly deployed database with ScyllaDB, at 12%. Redis and Elasticsearch were tied in fifth place, each being leveraged in 8% of ScyllaDB deployments.

(Chart: Databases Most Commonly Used with ScyllaDB: MySQL, PostgreSQL, Cassandra, MongoDB - Database Trends Report, ScaleGrid)

We also found 20% of Scylla deployments are leveraging other database types, including Oracle, Aerospike, Kafka (which is now transforming into an event streaming database), DB2 and Tarantool.

Most Time-Consuming ScyllaDB Management Tasks

We know that ScyllaDB is powerful, but how easy is it to use? We asked ScyllaDB users what their most time-consuming management task was, and 28% told us that Scylla repair was the most time-consuming. Scylla repair is a synchronization process that runs in the background to ensure all replicas eventually hold the same data. Users must run the nodetool repair command on a regular basis, as there is no way to automate repairs in the ScyllaDB open-source or ScyllaDB Enterprise versions, but you can set up a repair schedule through Scylla Manager.
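
For open-source users who are not running Scylla Manager, a common stop-gap is simply to script the repair themselves. The following hypothetical Python wrapper shells out to nodetool repair on a fixed interval; it assumes nodetool is installed on the node it runs on, and it is a sketch rather than an official Scylla tool.

```python
# Hypothetical helper: run `nodetool repair` on this node at a fixed interval.
# Assumes nodetool is installed locally; this is a stop-gap sketch, not a
# replacement for Scylla Manager's repair scheduling.
import logging
import subprocess
import time

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
REPAIR_INTERVAL_SECONDS = 7 * 24 * 3600   # once a week (placeholder policy)

def run_repair() -> None:
    logging.info("starting nodetool repair")
    result = subprocess.run(["nodetool", "repair"],
                            capture_output=True, text=True)
    if result.returncode != 0:
        logging.error("repair failed: %s", result.stderr.strip())
    else:
        logging.info("repair finished")

if __name__ == "__main__":
    while True:
        run_repair()
        time.sleep(REPAIR_INTERVAL_SECONDS)
```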

ScyllaDB slow query analysis tied with ScyllaDB backups and recoveries for second place, at 14% each. ScyllaDB does not currently appear to have a query analyzer available to identify queries that need optimizing, but users can use its Slow Query Logging to see which queries have the longest response times. ScyllaDB backups also cannot be automated through the open-source and enterprise versions, though ScyllaDB states that recurrent backups will be available in future editions of Scylla Manager. There is also no automated way to restore a ScyllaDB backup, as restores must be performed manually in all versions.

10% of ScyllaDB users reported that adding, removing or replacing nodes was their most time-consuming task, coming in at fourth place. These are manual processes that can take quite a bit of time, especially if you are dealing with large data sizes. Adding nodes is used to scale out a deployment, while removing them scales your deployment down. Nodes must be replaced if they are down or dead, though a cluster can still be available when more than one node is down.

Tied for fifth place at 7% each were upgrades and troubleshooting. Both ScyllaDB Enterprise and open source require extensive steps to upgrade a cluster. The recommended method is a rolling procedure so there is no downtime, but this is a manual process: the user must take one node down at a time, perform all of the upgrade steps, then restart and validate the node before moving on to the remaining nodes in the cluster. Time-consuming indeed, but fortunately not a daily task! Troubleshooting is, of course, a deep rabbit hole to dive into, but ScyllaDB Enterprise customers receive 24/7 mission-critical support, and open-source users have access to a plethora of resources, including documentation, mailing lists, Scylla University and a Slack channel for user discussions.

(Chart: Most Time-Consuming ScyllaDB Management Tasks - Database Trends Report, ScaleGrid)

The remaining 21% of time-consuming tasks reported by ScyllaDB users include monitoring, migrations, provisioning, balancing shards, compaction and patching.

So, how do these results compare to your ScyllaDB deployments? Are you looking for a way to automate these time-consuming management tasks? While we support MySQL, PostgreSQL, Redis and MongoDB today, we're always looking for feedback on which database to add support for next through our DBaaS plans. Let us know in the comments or on Twitter at @scalegridio if you are looking for an easier way to manage your ScyllaDB clusters in the cloud or on-premises!

Google’s BERT changing the NLP Landscape https://datafloq.com/read/googles-bert-changing-nlp-landscape/ Tue, 26 Nov 2019 13:26:57 +0000

We write a lot about open problems in Natural Language Processing. We complain a lot when working on NLP projects. We pick on the inaccuracies and blatant errors of different models. But what we need to admit is that NLP has already changed, and new models have solved problems that may still linger in our memory. One such drastic development is the launch of Google's Bidirectional Encoder Representations from Transformers, or BERT, a model that has been called the best NLP model ever based on its superior performance across a wide variety of tasks.

When Google researchers presented a deep bidirectional Transformer model that addresses 11 NLP tasks and surpassed even human performance in the challenging area of question answering, it was seen as a game-changer in NLP/NLU.

BERT model at a glance

  • BERT comes in two sizes: BERT Base, comparable to the OpenAI Transformer, and BERT Large, the model responsible for all the striking results.
  • BERT Large is huge, with 24 Transformer blocks, a hidden size of 1024, and 340M parameters.
  • BERT is pre-trained for 40 epochs over a 3.3 billion word corpus, including BooksCorpus (800 million words) and English Wikipedia (2.5 billion words).
  • BERT Large was pre-trained on 16 Cloud TPUs.
  • As input, BERT takes a sequence of words which keeps flowing up the stack. Each layer applies self-attention, passes its results through a feed-forward network, and then hands them off to the next encoder.
  • The output at each position is a vector of size hidden_size (768 in BERT Base). This vector can be used as the input for a classifier of your choice (see the sketch after this list).
  • The fine-tuned model improves the GLUE benchmark score to 80.5 percent (7.7 percent absolute improvement), MultiNLI accuracy to 86.7 percent (4.6 percent absolute improvement), the SQuAD v1.1 question answering Test F1 to 93.2 (1.5 absolute improvement), and so on, over a total of 11 language tasks.
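
To make that input/output flow concrete, here is a short, hedged sketch that runs a sentence through a pre-trained BERT Base model and extracts the 768-dimensional vector at the [CLS] position, i.e. the vector a downstream classifier would consume. It uses the Hugging Face transformers library, which is an assumption of this example rather than something referenced in the post.

```python
# Sketch (assuming the Hugging Face `transformers` library): run a sentence
# through pre-trained BERT Base and take the final hidden state of the [CLS]
# token as a 768-dimensional feature vector.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

inputs = tokenizer("BERT changed the NLP landscape.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

cls_vector = outputs.last_hidden_state[:, 0, :]   # shape: (1, 768)
print(cls_vector.shape)                           # torch.Size([1, 768])
```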

Theories underneath

BERT builds on top of a number of clever ideas that have been bubbling up in the NLP community recently including but not limited to Semi-supervised Sequence Learning (by Andrew Dai and Quoc Le), Generative Pre-Training, ELMo (by Matthew Peters and researchers from AI2 and UW CSE), ULMFiT (by fast.ai founder Jeremy Howard and Sebastian Ruder), the OpenAI transformer (by OpenAI researchers Radford, Narasimhan, Salimans, and Sutskever), and the Transformer (by Vaswani et al). However, unlike previous models, BERT is the first deeply bidirectional, unsupervised language representation, pre-trained using only a plain text corpus.


Two Pillars of BERT

BERT builds on two key ideas that paved the way for many of the recent advances in NLP:

  • the transformer architecture, and
  • unsupervised pre-training.

Transformer Architecture

The Transformer is a sequence model that forgoes the sequential structure of RNNs for a fully attention-based approach. Transformers offer both training efficiency and superior performance in capturing long-distance dependencies compared to recurrent neural network architectures, which fall short on long sequences. What makes BERT different from OpenAI GPT (a left-to-right Transformer) and ELMo (a concatenation of independently trained left-to-right and right-to-left LSTMs) is that its architecture is a deep bidirectional Transformer encoder.

A bidirectional encoder consists of two independent encoders: one encoding the normal sequence and the other the reversed sequence, with the outputs and final states concatenated or summed. A deep bidirectional encoder is an alternative in which the outputs of every layer are summed (or concatenated) before being fed to the next layer. However, it is not possible to train bidirectional models by simply conditioning each word on its previous and next words, since this would allow the word being predicted to indirectly see itself in a multi-layer model; this is the problem that prevented researchers from introducing bidirectional encoders into their models. BERT's solution to overcome this barrier is the straightforward technique of masking out some of the words in the input and then conditioning each word bidirectionally to predict the masked words.
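
The masking objective is easy to try for yourself. The hedged sketch below (again assuming the Hugging Face transformers library, which the post does not mention) asks a pre-trained BERT to fill in a masked word.

```python
# Sketch: BERT's masked language modelling objective in practice, assuming
# the Hugging Face `transformers` library.
import torch
from transformers import BertTokenizer, BertForMaskedLM

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertForMaskedLM.from_pretrained("bert-base-uncased")
model.eval()

text = "The capital of France is [MASK]."
inputs = tokenizer(text, return_tensors="pt")
with torch.no_grad():
    logits = model(**inputs).logits

mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero().item()
predicted_id = logits[0, mask_index].argmax().item()
print(tokenizer.decode([predicted_id]))   # expected to print "paris"
```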

Unsupervised pre-training

It is virtually impossible to separate the two sides of BERT. Apart from being bidirectional, BERT is also pre-trained. A model architecture is first trained on one language modeling objective, and then fine-tuned for a supervised downstream task. The model's weights are learned in advance through two unsupervised tasks: masked language modeling (predicting a missing word given the left and right context in the Masked Language Model (MLM) method) and the binarized next sentence prediction (predicting whether one sentence follows another). Therefore, BERT doesn't need to be trained from scratch for each new task; rather, its weights are fine-tuned.

Why does this combination matter?

Aylien Research Scientist Sebastian Ruder says in his blog that pre-trained models may have the same wide-ranging impact on NLP as pretrained ImageNet models had on computer vision. However, pre-trained representations are not homogeneous: they can either be context-free or contextual, and contextual representations can further be unidirectional or bidirectional. While context-free models such as word2vec or GloVe generate a single word embedding representation for each word in the vocabulary, contextual models generate a representation of each word that is based on the other words in the sentence. The bidirectional approach in BERT represents each word using both its previous and next context starting from the very bottom of a deep neural network, making it deeply bidirectional.
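
The difference between context-free and contextual representations is easy to demonstrate. In the hedged sketch below, which again assumes the Hugging Face transformers library, the word "bank" receives noticeably different BERT vectors in two different sentences, whereas a word2vec-style model would assign it a single vector.

```python
# Sketch: the same surface word gets different BERT embeddings in different
# contexts. Assumes the Hugging Face `transformers` library.
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")
model.eval()

def embedding_of(sentence: str, word: str) -> torch.Tensor:
    inputs = tokenizer(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state[0]
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
    return hidden[tokens.index(word)]

river = embedding_of("He sat on the bank of the river.", "bank")
money = embedding_of("She deposited cash at the bank.", "bank")
print(torch.cosine_similarity(river, money, dim=0).item())  # well below 1.0
```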

The pre-trained model can then be fine-tuned on small-data NLP tasks like question answering and sentiment analysis, and significantly improve the accuracy compared to training from scratch.

Visualizing BERT

Deep-learning models in general are notoriously opaque, and various visualization tools have been developed to help make sense of them. To understand how BERT works, it is possible to visualize attention with the help of Tensor2Tensor.

(Figure: visualizing BERT's attention with the Tensor2Tensor visualization tool.)

The tool visualizes attention as lines connecting the position being updated (left) with the position being attended to (right). Colors identify the corresponding attention head(s), while line thickness reflects the attention score. At the top of the tool, the user can select the model layer, as well as one or more attention heads (by clicking on the color patches at the top, representing the 12 heads).

Open-source

Soon after the release of the paper describing the model, the team also open-sourced its code and made available for download versions of the model that were already pre-trained on massive datasets. These span BERT Base and BERT Large, and cover languages such as English and Chinese, as well as a multilingual model covering 102 languages, trained on Wikipedia. Thanks to this invaluable gift, anyone building a machine learning model that involves language processing can now use this powerhouse as a readily available component, saving time, energy, knowledge, and resources.

The best way to try out BERT directly is through the BERT FineTuning with Cloud TPUs notebook hosted on Google Colab. Besides, it is a good starting point to try Cloud TPUs.

Afterwards you can proceed to the BERT repo and the PyTorch implementation of BERT. On top of it, the AllenNLP library uses this implementation to allow using BERT embeddings with any model.

BERT in practice

BERT was one of our top choices in the CALLv3 shared task (the text subtask of which we actually won). The Spoken CALL Shared Task is an initiative to create an open challenge dataset for speech-enabled CALL (computer-assisted language learning) systems. It is based on data collected from a speech-enabled online tool that helps Swiss German teens practise their English conversation skills. The task is to label prompt-response pairs as "accept" or "reject", accepting responses that are grammatically and linguistically correct.

We used BERT embeddings to classify the students' phrases as correct or incorrect. More specifically, we used its multi_cased_L-12_H-768_A-12 model trained on Wikipedia and the BookCorpus.

From BERT, we obtained a 768-dimensional vector for each phrase in the dataset. As inputs, we used German prompts translated with the Google Translate service and the corresponding English answers, concatenated via "|||". This approach turned out to work well in our case. Used in combination with the nnlm model, BERT showed the second-best result in our experiments. We did not perform fine-tuning because of the scarcity of the dataset; however, we believe that with a sufficient amount of data, fine-tuning BERT can yield even better results.
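
As a rough illustration of this kind of pipeline (not our actual competition code, which used the original TensorFlow checkpoint), the sketch below embeds hypothetical "prompt ||| answer" strings with the multilingual BERT model from the Hugging Face transformers library and trains a simple scikit-learn classifier on the frozen vectors; the example pairs and labels are invented.

```python
# Simplified, hypothetical sketch: embed each "prompt ||| answer" pair with
# pre-trained multilingual BERT and train a small classifier on the frozen
# 768-dim vectors. The example pairs and labels are invented.
import torch
from transformers import BertTokenizer, BertModel
from sklearn.linear_model import LogisticRegression

tokenizer = BertTokenizer.from_pretrained("bert-base-multilingual-cased")
model = BertModel.from_pretrained("bert-base-multilingual-cased")
model.eval()

def embed(text: str):
    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return model(**inputs).last_hidden_state[:, 0, :].squeeze(0).numpy()

pairs = [
    ("ask for the bill ||| Can I have the bill, please?", 1),   # accept
    ("ask for the bill ||| I want bill now please give", 0),    # reject
    ("order a coffee ||| I would like a coffee, please.", 1),   # accept
    ("order a coffee ||| Coffee me want very much.", 0),        # reject
]
X = [embed(text) for text, _ in pairs]
y = [label for _, label in pairs]

clf = LogisticRegression(max_iter=1000).fit(X, y)
print(clf.predict([embed("order a coffee ||| A coffee, please.")]))
```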

Our experiments reconfirmed that BERT is a powerful tool that can be used in such sentence-pair tasks as question answering and entailment.

Epilogue: the future is exciting

While we were writing this post, news came that the Facebook AI team had released code for XLM/mBERT pretrained models covering over 100 languages. All the code is built on top of PyTorch, and you can start playing around with the models directly using a provided IPython notebook. The new method, called XLM and published in a paper this year, provides a technique for pretraining cross-lingual language models based on the popular Transformer architecture. The release therefore means you can now use pretrained models, or train your own, to perform machine translation and cross-lingual classification across these languages and transfer them to low-resource languages, addressing a long-standing problem.

Is Google’s Go All Set to take over Python for Analytics https://datafloq.com/read/is-google-go-all-set-take-over-python-analytics/ Tue, 22 Oct 2019 11:22:11 +0000

When Guido van Rossum developed Python in the late 1980s, little did he know that it would become one of the world's most widely used programming languages. More than 8 million developers today use Python as their primary development language, thanks to its hassle-free code, abundance of libraries and plethora of applications. Be it website development, machine learning or analytics, Python is being used everywhere and anywhere people can think of.

One of the most important applications of Python is in analyzing data. Ever since the wave of digitization swept industries off their feet, the world has been blessed with plentiful data, be it in healthcare, business or any other industry. Developers are utilizing Python for data analytics, and it is proving to be of great importance.

There are more than just a few advantages of data analytics. On the one hand, it is helping people see the future with evidence from the past and the present. On the other hand, it is facilitating better decision making in all the processes. The point is, whatever people or industries want to accomplish with the data, Python is assisting in it. It is a tool that makes complex and tangled data appear sorted and clear. In other words, Python helps see data as information.

But as the world evolves, new programming languages keep emerging. They are born out of the shortcomings or deficiencies of existing languages and help improve the performance of systems in one way or another. One such programming language is Google's Go.

Golang, or Go, is a language developed by Google; the idea for it was conceived in 2007, but it was only in 2009 that the world saw its first release. It is much younger than Python, considering when each was created. In spite of having been around for ten years, why are people realizing Golang's importance now more than ever? Is it because the light around Python is dimming? Whatever the reason, recent events suggest that Google's Go has more than a few advantages over Python, especially when we talk about data analytics.

Google's Go was born out of the need for a language loosely based on the syntax of C, while eliminating the extraneous garbage of C++. As a result, its lead developers at Google, Robert Griesemer, Rob Pike and Ken Thompson, created Go with many features of modern languages, while deliberately leaving out object-oriented baggage such as operator overloading, pointer arithmetic and type inheritance. Apart from this, Go also has a robust standard library with unmatched performance and speed.

Even though Python can do a lot of what Go does, it falls short in some critical aspects. When it comes to speed, dynamic typing, the Global Interpreter Lock (GIL) and concurrency support, Python shows clear limitations for analytics. Let's take a more in-depth look at what this comparison between Go and Python means for analytics.

Performance

With large amounts of data on the table, optimal performance during analysis becomes fundamental. If the data are abundant and the system takes a long time to derive meaning from them, analysis turns into a drawn-out process. Python manages, and on its own it might not even seem slow, but compared with Google's Go the difference is easy to see.

Be it memory usage or time spent in a mathematical calculation, Golang is typically faster and less complicated than Python. Conveniently, anything you build in Go can still be called from Python.
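
As a rough illustration of that last point, a Go package can be compiled into a C-compatible shared library and loaded from Python with ctypes. This is only a sketch: the library name libstats.so and the exported SumInt64 function are hypothetical, and the Go side is assumed to have been built separately with go build -buildmode=c-shared.

```python
import ctypes

# Assumes a Go library was compiled beforehand with:
#   go build -buildmode=c-shared -o libstats.so stats.go
# where stats.go exports a cgo function, e.g.:
#   //export SumInt64
#   func SumInt64(xs *C.longlong, n C.longlong) C.longlong { ... }
lib = ctypes.CDLL("./libstats.so")  # hypothetical library name
lib.SumInt64.restype = ctypes.c_longlong
lib.SumInt64.argtypes = [ctypes.POINTER(ctypes.c_longlong), ctypes.c_longlong]

data = (ctypes.c_longlong * 4)(10, 20, 30, 40)
print(lib.SumInt64(data, len(data)))  # expected output: 100
```

The practical upshot is that a Python-based analytics pipeline can delegate its hottest numerical loops to Go without being rewritten wholesale.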

Scalability

Python has also been in the news recently because Golang replaced it in a high-profile project. The cloud software company Salesforce decided it was better to use Google's Go instead of Python for its data analytics product, Einstein Analytics. This may come as a shock to Python lovers, but the reality is that large Python codebases become hard to maintain and scale. Moreover, today's applications demand scalability because user needs change rapidly.

Anything that limits the performance or growth of a system ends up hurting the business. In Salesforce's case, the company found that Python struggled badly with multi-threading, largely because the GIL prevents more than one thread from executing Python bytecode at a time. Considering how much Einstein Analytics would need to scale, continuing with Python was not a viable choice.
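
A quick way to see the limitation for yourself: the toy benchmark below runs the same CPU-bound function once sequentially (twice in a row) and then on two threads. On CPython the threaded run gains essentially nothing because of the GIL; the exact timings will vary by machine.

```python
import threading
import time

def cpu_bound(n: int) -> int:
    # Pure Python bytecode in a tight loop, so the GIL serializes it.
    total = 0
    for i in range(n):
        total += i * i
    return total

N = 5_000_000

start = time.perf_counter()
cpu_bound(N)
cpu_bound(N)
sequential = time.perf_counter() - start

start = time.perf_counter()
threads = [threading.Thread(target=cpu_bound, args=(N,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded = time.perf_counter() - start

# On CPython the two-thread run is roughly as slow as the sequential one,
# because only one thread can hold the GIL at any moment.
print(f"sequential: {sequential:.2f}s, two threads: {threaded:.2f}s")
```

In Go, the equivalent work spread across goroutines would use both cores; in Python you would have to reach for multiprocessing or a native extension to get the same effect.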

In-Built Features

Every language is built with a purpose of its own; languages find their applications in related niches and then extend their reach. Python's abundance of libraries gives it the upper hand for data analytics: Pandas for data manipulation and analysis, Matplotlib for plotting and visualization, StatsModels for statistical modelling and testing, Seaborn for statistical data visualization, and many more. There is no end to the libraries you can use for data analytics in Python.
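
To make that concrete, here is a minimal sketch with made-up data showing how little code the stack above needs to go from raw numbers to a fitted trend; the column names and figures are purely illustrative.

```python
# pip install pandas statsmodels numpy
import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "day": np.arange(60),
    "sales": 100 + 2.5 * np.arange(60) + rng.normal(0, 8, 60),  # fake data
})

# Pandas for wrangling, StatsModels for the actual statistics.
X = sm.add_constant(df["day"])
trend = sm.OLS(df["sales"], X).fit()

print(df["sales"].describe())  # quick summary of the raw column
print(trend.params)            # estimated intercept and daily growth
```

Reproducing even this small exercise in Go means hand-rolling or hunting down equivalents for each of those steps.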

Google's Go, on the other hand, is focused on being a systems language and has found acceptance in cloud computing and cluster computing. In data analytics, its use can be extended thanks to its performance and the fact that it offers built-in concurrency.

There is also a difference in memory control: both languages are garbage-collected, but Go gives developers finer control over memory layout through pointers and value types, whereas Python abstracts memory management away entirely.

Conclusion

Python is one of the best languages for getting started and for quick, exploratory work. However, if you have to write and maintain large amounts of code, Go is often the better choice: Python's dynamic typing, limited control over memory and growing complexity as codebases get longer make it an awkward option for extensive system development. Golang, on the other hand, offers faster and more transparent built-in functionality but lacks the data analytics libraries that Python enjoys. In other words, when data analytics demands concise, quick answers, it is hard to beat Python; but for a large-scale, performance-critical production system, Go is a preferable option.

The post Is Google’s Go All Set to take over Python for Analytics appeared first on Datafloq.

]]>
Why Google Choose Dart & Not Any Other Language? https://datafloq.com/read/why-google-choose-dart-not-any-other-language/ Mon, 30 Sep 2019 09:26:14 +0000 https://datafloq.com/read/why-google-choose-dart-not-any-other-language/ Tech giant Google is known to constantly release new projects in the market; some of them are huge hits and then there are some that just go unnoticed by a […]

The post Why Google Choose Dart & Not Any Other Language? appeared first on Datafloq.

]]>
Tech giant Google is known for constantly releasing new projects; some of them are huge hits, and some go largely unnoticed. Speaking of huge hits, one of them is Flutter, Google's very own cross-platform app development framework, which has been stealing the spotlight for the past few months.

Well, to be exact, the hybrid mobile app development trend started back in 2011, when Xamarin released its own solution, the Xamarin SDK (Software Development Kit), for building hybrid mobile applications with C#. Other companies soon followed suit, and that is when Google launched the Flutter framework to compete in an already crowded market.

Now the question on app developers' minds is: why does Flutter use Dart? It is a question that nags many mobile developers who are skilled in languages other than Google's Dart but want to use the Flutter framework all the same.

In this article, we will cover the top reasons why Google chose Dart, and not any other programming language, for its Flutter framework.

Reasons Google Chose Dart for Flutter Framework

The main reasons are as follows:

1. Preemptive Scheduling

Many programming languages on the market, such as Swift, Kotlin, Objective-C and Java, rely on preemptive scheduling to switch between multiple concurrently executing threads.

Each thread is allotted a slice of time in which to execute, and if it overruns, a context switch preempts it. Dart takes a different route: its isolates do not share memory, so there is no need for locks to guard shared state against data races.

2. Garbage Collection

Garbage collection is another key factor in why Google decided to go with Dart for its Flutter framework instead of opting for any other programming language. The majority of languages require the use of locks so that they can access some shared resources, but that's not the case with Dart.

Because Dart can perform garbage collection without taking a single lock, mobile apps built on it tend to run smoothly, with fewer pauses and stutters.

3. Single Layout Format

Unlike many other languages, Dart does not require a separate declarative layout language or a visual interface builder.

This is because Dart's layout code is simple and readable, which makes it easy for developers to visualize the UI. With layout and logic unified in one language, changes can be made in a single place, and Dart's tooling keeps the process time-efficient.

Conclusion

The Dart programming language also delivers well-ordered animations and smooth transitions that run at 60 fps. Adapting to Dart is comparatively easy, as it shares many similarities with the C and Java programming languages.

So, these are some of the primary reasons that make Dart the most suitable programming language for Google's Flutter framework. Beyond the points above, Dart brings other benefits such as Ahead-of-Time (AOT) compilation for fast, predictable release builds and Just-in-Time (JIT) compilation for quick development cycles, both of which keep the app feeling fast.

The post Why Google Choose Dart & Not Any Other Language? appeared first on Datafloq.

]]>
Safeguarding Your Data: Comparing the Top Cloud Backup Solutions https://datafloq.com/read/safeguarding-data-comparing-cloud-backup-solutions/ Tue, 02 Jul 2019 11:25:42 +0000 https://datafloq.com/read/safeguarding-data-comparing-cloud-backup-solutions/ Securing your database by moving it to cloud storage is not as much of a hassle as it may seem. Organizations use cloud storage services to back up sensitive data […]

The post Safeguarding Your Data: Comparing the Top Cloud Backup Solutions appeared first on Datafloq.

]]>
Securing your database by moving it to cloud storage is not as much of a hassle as it may seem. Organizations use cloud storage services to back up sensitive data and maximize their chances of recovery from an attack. Many companies are opting for cloud-based storage instead of storing data physically in a hard drive. The benefits of lower infrastructure cost, accessibility, and enhanced security make cloud backup very attractive for companies both small and large. Read on to learn why it is important to use cloud backup and see a comparison of the top three vendors to help you make your choice.

What Is Cloud Backup and Why Is It Important?

Cloud backup is a type of cloud storage solution that allows you to store and retrieve data from a distributed network. Companies use cloud backup solutions to automatically back up their data to a secure private or public cloud.

A common strategy involves sending a copy of the data over a public or private network to an off-site server, usually belonging to a third-party service provider. When choosing a database backup software solution, the most critical issue is its reliability.

Backing up data in the cloud allows you to restore the database with ease in the event of a disaster. According to the Boston Computing Network:

  • 93% of companies that lost their data center for longer than 10 days closed within one year of the disaster, with 60% closing at 6 months.

  • 34% of companies don't test their tape backups, and of the companies that do test, 77% have found tape backup failures.

Cloud backup systems operate around a client software application that runs the backups on a schedule chosen by the client. Most services offer twenty-four- or twelve-hour backup. If the customer sets the backup schedule to half-daily backups, for instance, the system collects, compresses, encrypts and transfers data to the cloud server every twelve hours. A common practice is to perform a full backup at the beginning of the service and complement it with incremental backups afterward.
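
As a toy illustration of the incremental step, the sketch below walks a directory tree and picks out only the files modified since the previous run. Real backup clients typically track changes at the block level and add compression and encryption on top; the directory path here is purely hypothetical.

```python
import os
import time

def files_changed_since(root: str, last_backup_ts: float):
    """Yield paths under `root` modified after the previous backup run."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) > last_backup_ts:
                yield path

# Example: everything touched in the last twelve hours, matching a
# twice-daily backup schedule.
twelve_hours_ago = time.time() - 12 * 3600
for path in files_changed_since("/var/lib/myapp", twelve_hours_ago):  # hypothetical path
    print("would back up:", path)
```

Only this changed subset would then be compressed, encrypted and shipped to the cloud, which is what keeps incremental backups cheap.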

The main advantages of storing the backup in the cloud relate to the recovery time and the security the cloud vendors offer:

  • Recovery time: cloud backups allow you to restore your database to a working state in a matter of minutes, since the data is already stored online.

  • Security: cloud backup services often come bundled with security solutions, including application-aware protection for Exchange and SQL Server.

Comparing AWS, Azure and Google Cloud Backup on Different Domains

Since cloud backups offer a cost-effective alternative to a dedicated data center, they are growing in popularity, and several cloud vendors are competing to meet companies' demands. Here we will compare the top three: Amazon Web Services (AWS), Microsoft Azure, and Google Cloud.

Amazon Web Services (AWS)

The pioneer in cloud computing, AWS offers over 140 services, with more added all the time. Its backup service, AWS Backup, supports aggregating and consolidating vast amounts of data.

The key features of AWS Backup include:

  • Centrally managed backups: a central backup console provides visibility and simplifies management. You can use the console to back up, restore and set backup retention policies.

  • Automated backup processes: the platform offers a fully managed solution. AWS Backup provides automated backup schedules based on policies set by the customer, and you can apply backup policies to resources by tagging them.

  • Backup compliance: data is encrypted in transit and at rest, and the system consolidates backup logs, facilitating compliance audits.

A key service that integrates with AWS Backup is Amazon Simple Storage Service (S3), a cloud-based object storage service that provides developers with scalable, fast and simple storage infrastructure. Through Amazon Glacier, it offers backup features such as:

  • Cost-effective storage: pricing suited to long-term backup and data archiving.

  • Fast data retrieval: from large-scale databases.

  • Secure and durable object storage: including a Deep Archive storage class.

Alternatively, Amazon Elastic Block Store (EBS) is used for storing persistent data from the Amazon Elastic Compute Cloud (EC2), a Virtual Machine (VM) service. It offers the following backup features:

  • Automatic replication of volumes: within their Availability Zone, keeping the data safe whether the instance is active or not.

  • Cost-effective incremental backups: through EBS snapshots, which capture only the blocks changed since the previous snapshot, minimizing duplication (a brief sketch follows below).
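
For instance, an on-demand EBS snapshot can be triggered with a few lines of boto3. This is a minimal sketch assuming AWS credentials are already configured locally; the region, volume ID and tag values are hypothetical placeholders.

```python
# pip install boto3
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Snapshots are incremental: only blocks changed since the previous
# snapshot of this volume are stored again.
response = ec2.create_snapshot(
    VolumeId="vol-0123456789abcdef0",  # hypothetical volume ID
    Description="Nightly backup of the analytics database volume",
    TagSpecifications=[{
        "ResourceType": "snapshot",
        "Tags": [{"Key": "backup-plan", "Value": "nightly"}],
    }],
)
print(response["SnapshotId"], response["State"])
```

In practice you would schedule this (or let AWS Backup's policies do it) rather than call it by hand.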

You can use the service adapted to your cloud environment, be it cloud-native, hybrid or on premises. First, in a cloud-native environment, AWS Backup integrates with other services such as Amazon EBS, Amazon DynamoDB, and AWS Storage Gateway, through the centralized console.

For hybrid environments, AWS Backup uses AWS Storage Gateway, which enables on-premises applications to use AWS cloud storage. You can apply the same backup policies to the AWS Cloud resources and to your on-premises data stored in the AWS Storage Gateway. Finally, for on-premises environments, you can back up your application data by storing it in AWS Storage Gateway volumes, which in turn are supported in the AWS Cloud.

Microsoft Azure

Microsoft Azure is a cloud computing service offering enterprise-grade SPI (Software-, Platform-, and Infrastructure-as-a-Service) solutions. Azure brings Microsoft's ecosystem and Windows support to users, with functionality spanning the entire production cycle.

Azure's focus on hybrid cloud networks allows companies to migrate their data to the cloud while keeping their on-premise hardware. The key functions of Azure Backup services are:

  • REST-based object storage

  • Data lake for big data

  • Queue storage for large volume workloads

  • Backup built into the platform

  • Backup support in one click

  • Data encryption for long periods

  • Multifactor authentication controls

  • Disaster Recovery as a Service (DRaaS) built into the platform

Azure Site Recovery allows for a simple and fast set up. You only need to replicate an Azure Virtual Machine into another Azure region. The service also provides automatic recovery and recovery testing without compromising running workloads.

Google Cloud

Google offers secure analytics and cloud storage. While not as big a competitor as the previous two, Google's platform has proven useful for small companies. It is especially attractive to startups because of its support of open source developments.

Google Cloud Platform (GCP) offers backup for objects, blobs, blocks, files and server content. Although there is no single official backup product, the platform offers a variety of services that can be tailored for backup, for example:

  • Persistent Disk: provides snapshots for backing up the block storage used by virtual machines and containers.

  • Coldline: a storage class for cold data and cloud archives, intended for data accessed at most about once per year.

  • Nearline: a warmer storage class for data accessed no more than about once a month, useful for backup files (a short upload sketch follows below).
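
As a small sketch of how such a custom backup might look, the snippet below uploads an archive to a Cloud Storage bucket and then moves it to the Coldline class using the google-cloud-storage client. The bucket name, object path and local file are hypothetical; you could equally set Nearline or Coldline as the bucket's default storage class instead of rewriting the object afterwards.

```python
# pip install google-cloud-storage  (assumes GCP credentials are configured)
from google.cloud import storage

client = storage.Client()
bucket = client.bucket("my-backup-bucket")  # hypothetical bucket name

# Upload the archive as a regular object first.
blob = bucket.blob("backups/2019-07-01/db-dump.tar.gz")
blob.upload_from_filename("db-dump.tar.gz")

# Rewrite the object into a colder (cheaper) storage class once uploaded.
blob.update_storage_class("COLDLINE")
print(blob.storage_class)
```

A cron job or Cloud Scheduler task wrapping something like this is a common low-cost backup pattern on GCP.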

Customers can integrate with third-party backup providers listed in the GCP partners page.

Which One Should You Choose?

All three cloud providers covered in this article have strengths that help customers find the right solution for their organizations. Microsoft Azure is user-friendly, and because backup is built into the platform it is easy to set up. AWS Backup provides automated, centralized backup that is attractive for companies with vast workloads. Google Cloud's main advantage is cost-effective backup storage that integrates with third-party backup solutions.

A good rule of thumb is to start from the resources you already use and your company's requirements, since a good solution needs to fit like a glove. For example, a small startup looking for scalability and low-cost storage may consider Google Cloud Platform, while a larger company looking to automate large database workloads can look into AWS. For organizations working in a Microsoft ecosystem, Azure could be the logical choice. What you choose will ultimately depend on your company's goals and needs.

The post Safeguarding Your Data: Comparing the Top Cloud Backup Solutions appeared first on Datafloq.

]]>
Genderless Voice AI Could Provide Major Step in Combating Implicit Bias https://datafloq.com/read/genderless-voice-ai-could-provide-major-step-bias/ Wed, 01 May 2019 11:07:14 +0000 https://datafloq.com/read/genderless-voice-ai-could-provide-major-step-bias/ Implicit in any technical process or system are the biases of those writing the code that will govern the actions of that respective technical system or process. I'm not throwing […]

The post Genderless Voice AI Could Provide Major Step in Combating Implicit Bias appeared first on Datafloq.

]]>
Implicit in any technical process or system are the biases of those writing the code that will govern the actions of that respective technical system or process. I'm not throwing shade at developers in saying that, but rather highlighting that we all suffer from implicit biases whether known or not and those biases get baked into the software solutions we develop and deliver.

We've written about it before, but I think it bears repeating because there are some pretty fascinating solutions on deck aimed at combating some common but often unrecognized variants of this. Namely, the interface of the future, natural language processing (NLP), has been confined to binary voice characteristics unnecessarily. Enter 'Q', the genderless voice AI.

Why do our vocal assistants' voices matter?

Quoting a great paragraph from Mark Wilson in Fast Company:

Voice assistants like Apple‘s Siri and Amazon‘s Alexa are women rather than men. You can change this in the settings, and choose a male speaker, of course, but the fact that the technology industry has chosen a woman to, by default, be our always-on-demand, personal assistant of choice, speaks volumes about our assumptions as a society: Women are expected to carry the psychic burden of schedules, birthdays, and phone numbers; they are the more caregiving sex, they should nurture and serve. Besides, who wants to ask a man for directions? He'll never pull over at a gas station if he's lost!

The gender of our assistants matters a great deal because of the values it signals at a societal level. It also reinforces gender biases that we each might hold personally. Beyond that, and looking to the future, natural language processing is the interface of tomorrow: the better NLP AI gets, the more we'll use it to perform daily functions.

Pulling out a phone and typing a question into Google isn't necessarily natural or efficient; it's just what we've gotten used to for looking things up or winning arguments with our friends. It would be far preferable to just say, 'Hey Siri, when was the Magna Carta drafted?' and have your AirPods hear you, query the web, and return an answer directly into your ears, without you pulling out your phone or drafting your opposable thumbs into service.

So if voice assistants are the interfaces of tomorrow, the manner in which they're presented does indeed matter a great deal.

Genderless voice AI: what does it sound like?

Enter 'Q', the genderless voice solution. According to its website, it was developed in close collaboration between Copenhagen Pride, Virtue, Equal AI, Koalition Interactive & thirtysoundsgood. On how the genderless-sounding voice was developed, Fast Company reports:

Creators Emil Asmussen and Ryan Sherman from Virtue Nordic sampled several real voices from non-binary people, combined them digitally, and created one master voice that cruises between 145 Hz and 175 Hz, right in a sweet spot between male- and female-normative vocal ranges. To the developers, it was important that Q wasn't just designed as non-binary, but actually perceived by users as non-binary, too. So through development, the voice was tested on more than 4,600 people identifying as non-binary from Denmark, the U.K., and Venezuela, who rated the voice on a scale of 1 to 5, 1 being male and 5 being female. They kept tuning the voice with more feedback until it was regularly rated as being gender-neutral.
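
For readers curious about that 145-175 Hz band, here is a minimal sketch of how one might estimate where a recording's fundamental frequency falls, using librosa's pyin pitch tracker. The WAV file name is a placeholder, a reasonably recent librosa is assumed, and the thresholds simply mirror the range quoted above.

```python
# pip install librosa numpy
import librosa
import numpy as np

y, sr = librosa.load("voice_sample.wav", sr=22050)  # hypothetical recording

f0, voiced_flag, voiced_probs = librosa.pyin(
    y, fmin=librosa.note_to_hz("C2"), fmax=librosa.note_to_hz("C6"), sr=sr
)
median_f0 = np.nanmedian(f0[voiced_flag])  # ignore unvoiced frames

print(f"median f0: {median_f0:.1f} Hz")
if 145 <= median_f0 <= 175:
    print("within the gender-ambiguous band described for Q")
```

It is only a rough proxy, since perceived gender depends on far more than pitch, but it shows how concrete and measurable the design target was.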

Here's Q's introductory video:

To be fair, the companies that own the voice assistant realm aren't necessarily sexist; they're simply capitalists. As Leah Fessler, a former Quartz reporter, has pointed out,

the tech companies' choices are driven by purely commercial motivations. Women's voices make more money, she wrote in a story exploring how bots are trained to respond to sexual harassment. Indeed, research has shown that people find voices that are perceived as female warm, and that men and women both have a preference for women's voices. This bias also turned up in Amazon and Microsoft's market research, the Wall Street Journal reports.

But many of those tests only presented gendered options to their focus groups, and the sample sizes might not have been large enough to prove the point. The hope is that, given the option and a large enough sample size, enough users might choose Q as a preferable tonality, pushing Amazon, Google and Apple to offer some version of it in the future.

The post Genderless Voice AI Could Provide Major Step in Combating Implicit Bias appeared first on Datafloq.

]]>