Gmail's new RetVec module enhances its text categorization algorithm to improve spam filtering | crm.buzz

Gmail enhances its algorithm for text categorization and spam filtering, and imposes extra requirements on email marketers.

Gmail’s spam filtering is now undergoing dramatic changes.

Fast, efficient, and accurate identification of harmful and abusive content, such as phishing, fraud attempts, spam, and offensive comments or posts, is an important pillar of user protection. Google has now launched an efficient, innovative module to improve detection of these attacks and has implemented it in Gmail as well.

Sophisticated, malicious senders use various methods that make it very hard for text models to classify content accurately and efficiently. These include homoglyphs (replacing characters with look-alike ones), invisible characters, and keyword stuffing, all meant to trick machine learning (ML) based defense mechanisms.
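To make these tricks concrete, here is a minimal Python sketch of how homoglyphs and invisible characters change a word without changing how it looks, and how a simple normalization step can undo them. This is not how RETVec works internally; the `HOMOGLYPHS` table and the `normalize` function are hypothetical and cover only a handful of characters for illustration.

```python
import unicodedata

# Hypothetical, deliberately tiny look-alike table (Cyrillic -> Latin).
HOMOGLYPHS = {
    "\u0430": "a",  # Cyrillic small a
    "\u0435": "e",  # Cyrillic small ie
    "\u043e": "o",  # Cyrillic small o
    "\u0440": "p",  # Cyrillic small er
}

def normalize(text):
    """Fold compatibility forms, drop invisible format characters, map look-alikes."""
    # NFKC folds forms such as full-width letters into their plain equivalents.
    text = unicodedata.normalize("NFKC", text)
    # Unicode category "Cf" covers zero-width spaces, joiners, and similar characters.
    text = "".join(ch for ch in text if unicodedata.category(ch) != "Cf")
    return "".join(HOMOGLYPHS.get(ch, ch) for ch in text.lower())

disguised = "fr\u0435\u0435"      # looks like "free", uses two Cyrillic letters
hidden = "f\u200br\u200cee"       # "free" with zero-width characters inside
print(disguised == "free")        # False: a naive word match misses it
print(normalize(disguised), normalize(hidden))
```

A hand-written table like this never covers every look-alike character, which is why a vectorizer that is resilient to such edits by design is attractive.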

RetVec: A New Revolution in Text Categorization

The new model is called RETVec, which stands for Resilient & Efficient Text Vectorizer. According to Google, it is revolutionary in both accuracy and efficiency: it cut the required processing power (measured as usage of TPUs, Tensor Processing Units) by 83%, which shows up as energy savings, shorter processing time, and better memory management.

The RETVec model is lightweight compared to other models (about 200,000 parameters), multilingual, and exceptionally accurate compared to the models Google had implemented so far.

The model supports all languages and has been released as open source. Thanks to its efficiency, it can run on-device, including on mobile devices, and it can be used in a variety of applications that analyze and classify text.

According to Google's announcement of 11/29/2023, the RETVec model was tested in Gmail in recent years and is now operational.

Gmail blocks about 15 billion unwanted emails every day, and Google says it detects about 99.9% of the phishing, spam, and malware sent to Gmail users and keeps it out of their inboxes.

A dramatic change in the effectiveness of Gmail spam filtering

The RETVec module improves spam detection by 38% and, just as significantly, reduces misclassification: a 19.4% improvement in false positives (legitimate emails flagged as spam) and a 17.71% improvement in false negatives (spam that gets through).

Gmail RETVec text vectorizer
Source: Google's announcement

The importance of free text in email

Over the years, spammy words have become obsolete in B2C email filtering, yet Microsoft and other providers still rely on Bayes-based filtering, which weighs “bad” words against “good” words to score the “spamminess” of an email.
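For readers who have not seen it, Bayes-based filtering can be sketched in a few lines of Python. This is a toy illustration with made-up training messages, not any provider's actual filter; the function names `train` and `spam_log_odds` are hypothetical.

```python
import math
from collections import Counter

def train(messages):
    """messages: iterable of (text, is_spam) pairs. Returns per-class word counts."""
    counts = {True: Counter(), False: Counter()}
    totals = {True: 0, False: 0}
    for text, is_spam in messages:
        counts[is_spam].update(text.lower().split())
        totals[is_spam] += 1
    return counts, totals

def spam_log_odds(text, counts, totals):
    """Positive result: words lean spam. Negative result: words lean legitimate."""
    vocab_size = len(set(counts[True]) | set(counts[False]))
    n_spam = sum(counts[True].values())
    n_ham = sum(counts[False].values())
    score = math.log(totals[True] / totals[False])  # prior odds of spam
    for word in text.lower().split():
        # Add-one smoothing so unseen words do not produce a zero probability.
        p_spam = (counts[True][word] + 1) / (n_spam + vocab_size)
        p_ham = (counts[False][word] + 1) / (n_ham + vocab_size)
        score += math.log(p_spam / p_ham)
    return score

# Made-up training data, for illustration only.
counts, totals = train([
    ("free prize click now", True),
    ("win free money now", True),
    ("meeting agenda for monday", False),
    ("invoice for your order", False),
])
print(spam_log_odds("free money", counts, totals) > 0)      # leans spam
print(spam_log_odds("monday meeting", counts, totals) < 0)  # leans legitimate
```

Because the score depends on exact word matches, a word the filter has never seen carries no weight, which is one reason word lists alone are a weak signal.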

Gmail now uses the new RETVec mechanism, which underlines the importance of using live text in the email's body. Image-only emails, popular as they are, may be easier to produce, but they have clear disadvantages: they are inaccessible, they cannot carry a separate CTA for each link (the whole image is a single click target), and their content cannot be found when searching the inbox.

Words that in the past were considered spammy, such as “free”, may even increase recipients’ engagement with the emails.

See more in an article on spammy words

See more in an article on accessible emails

Gmail toughens the requirements for senders

Gmail is filling another gap and toughening its requirements for marketers starting in February 2024.

Gmail finally wants to stop marketers from sending through email platforms (ESPs) while using a private email address (an address on a domain such as Gmail, Outlook, or Yahoo) as the sender address.

This requirement will benefit both senders and Gmail users, and it will require marketers to take responsibility and use their own domain wisely.

From now on, Gmail will enforce a DMARC policy in quarantine mode on its gmail.com and googlemail.com domains, which in effect means private Gmail addresses can no longer be used as sender addresses in mailing systems.
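A sender list can be checked for this problem before a campaign goes out. The Python sketch below flags From addresses on free mailbox domains; the `FREE_MAILBOX_DOMAINS` set is a hypothetical, incomplete list and the function name is illustrative.

```python
from email.utils import parseaddr

# Hypothetical, incomplete list of free mailbox domains.
FREE_MAILBOX_DOMAINS = {"gmail.com", "googlemail.com", "outlook.com", "yahoo.com"}

def uses_free_mailbox_domain(from_header):
    """Return True if the From header's address is on a free mailbox domain."""
    _, address = parseaddr(from_header)          # drops the display name
    domain = address.rpartition("@")[2].lower()  # text after the last "@"
    return domain in FREE_MAILBOX_DOMAINS

print(uses_free_mailbox_domain("Shop News <shopnews@gmail.com>"))
print(uses_free_mailbox_domain("Shop News <news@shop.example>"))
```

Any address the check flags should be replaced with an address on a domain the sender controls and can authenticate.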

Gmail's new requirements for marketers apply to anyone who sends over 5,000 emails a day, counted across all email platforms, to the gmail.com domain.

These are Gmail’s new requirements for marketers:

Domain verification:

Email senders must verify their sending domain using SPF or DKIM.

SPF authentication alone meets this Gmail requirement, but in a shared IP pool (the situation for many senders), the IP addresses that the SPF record approves cannot be tied to one specific domain.

Sometimes there are hundreds or thousands of such addresses. It is therefore essential to verify the domain with DKIM as well, because a domain's reputation is linked to its specific DKIM signature.
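To show what "the IP addresses that the SPF record approves" means in practice, here is a small Python sketch that reads the `ip4:` and `ip6:` entries of an SPF record and checks whether a sending IP is among them. The record and addresses are made-up documentation examples, and the sketch is simplified: it does not perform DNS lookups or follow `include:` entries, and it ignores the result qualifiers other than `+`.

```python
import ipaddress

def spf_networks(record):
    """Return the networks named by ip4:/ip6: entries in an SPF record string."""
    networks = []
    for term in record.split()[1:]:   # skip the leading "v=spf1" tag
        term = term.lstrip("+")       # "+" is the default (pass) qualifier
        if term.startswith(("ip4:", "ip6:")):
            value = term.split(":", 1)[1]
            networks.append(ipaddress.ip_network(value, strict=False))
    return networks

def ip_is_listed(ip, record):
    """True if the IP falls inside any network listed directly in the record."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in spf_networks(record))

# Hypothetical record for illustration only.
record = "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.17 include:_spf.example.net ~all"
print(ip_is_listed("192.0.2.55", record))   # inside the /24 range
print(ip_is_listed("203.0.113.5", record))  # not listed
```

On a shared pool, the same listed addresses also send mail for many other customers, which is why a DKIM signature tied to the sender's own domain matters.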

See more in an article on domain verification in email platforms.

Easy unsubscription:

Allow subscribers to remove themselves from the list easily with one-click unsubscribe (a code that the email platform attaches to the header of the message). This mechanism is specified in RFC 8058.
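For those curious about what the email platform adds, RFC 8058 one-click unsubscribe relies on two message headers: `List-Unsubscribe` with an HTTPS link, and `List-Unsubscribe-Post` with the fixed value `List-Unsubscribe=One-Click`. The Python sketch below builds them with the standard library; the URL, addresses, and the helper name `add_one_click_unsubscribe` are hypothetical.

```python
from email.message import EmailMessage

def add_one_click_unsubscribe(msg, https_url, mailto=None):
    """Attach the two headers RFC 8058 uses for one-click unsubscribe."""
    targets = [f"<{https_url}>"]
    if mailto:
        targets.append(f"<mailto:{mailto}>")  # optional mail-based fallback
    msg["List-Unsubscribe"] = ", ".join(targets)
    msg["List-Unsubscribe-Post"] = "List-Unsubscribe=One-Click"
    return msg

msg = EmailMessage()
msg["From"] = "News <news@shop.example>"   # hypothetical sender
msg["To"] = "subscriber@example.com"       # hypothetical recipient
msg["Subject"] = "Weekly update"
msg.set_content("Hello!")
add_one_click_unsubscribe(msg, "https://shop.example/u/abc123", "unsub@shop.example")
print(msg["List-Unsubscribe"])
print(msg["List-Unsubscribe-Post"])
```

RFC 8058 also expects the message to carry a valid DKIM signature that covers both headers, so in practice this is handled by the email platform rather than by hand.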


Zero tolerance for spam

Email senders will be required to keep user-reported spam complaints at a very low level. Gmail asks senders to stay below 0.1% and never to reach 0.3%. The Gmail spam complaint rate is not visible in the email platform, only in Google Postmaster Tools.
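As a quick arithmetic check, the sketch below turns complaint counts into a rate and compares it with the two figures above. It assumes 0.1% is the level to stay under and 0.3% is the level never to reach; the function name and the status labels are hypothetical, and the exact way Google counts the denominator is not covered here.

```python
def spam_rate_status(complaints, delivered):
    """Return (rate, status) for a number of spam complaints out of delivered emails."""
    if delivered <= 0:
        raise ValueError("delivered must be positive")
    rate = complaints / delivered
    if rate < 0.001:       # below 0.1%
        return rate, "ok"
    if rate < 0.003:       # between 0.1% and 0.3%
        return rate, "warning"
    return rate, "over limit"

print(spam_rate_status(5, 10_000))    # 0.05% of recipients complained
print(spam_rate_status(20, 10_000))   # 0.2%
print(spam_rate_status(50, 10_000))   # 0.5%
```

At these thresholds, as few as 10 complaints per 10,000 delivered emails is already enough to leave the safe zone.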


Publish a DMARC record:

Email senders must publish a DMARC record, even if the policy is p=none.

I recommend setting up a DMARC Policy with an external deployment and monitoring tool and not using the settings provided by the various ESPs.
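A DMARC record is a DNS TXT record published at `_dmarc.<your domain>`, made of `tag=value` pairs separated by semicolons. The Python sketch below parses such a record and reads its policy; the record shown and the report address are hypothetical, and the parser is a simplified illustration rather than a full validator.

```python
def parse_dmarc(record):
    """Split a DMARC TXT record into a dict of tag -> value."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    if tags.get("v") != "DMARC1":
        raise ValueError("not a DMARC record")
    return tags

# Hypothetical record, as it might be published at _dmarc.shop.example
record = "v=DMARC1; p=none; rua=mailto:dmarc-reports@shop.example"
tags = parse_dmarc(record)
print(tags["p"])     # requested policy: none, quarantine, or reject
print(tags["rua"])   # where aggregate reports are sent
```

With `p=none`, receivers only report on authentication results; the reports sent to the `rua` address are what a monitoring tool reads before the policy is tightened.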

See more in an article on the DMARC protocol and how to implement it.

Podcast interviews with international email marketing experts

Go to my new podcast, emailgeeks.show, for English-only interviews.
Want to consult with me about improving your email marketing or your email deliverability? I invite you to a free, half-hour initial consultation. Book a 1/2-hour email deliverability discovery call.

Further reading

Google’s announcement of the RETVec module

About the author

Sella Yoffe

Works with companies, businesses, startups, and mailing platforms in Israel and worldwide on email deliverability and email marketing strategy, so that the emails businesses send reach the Inbox and not Spam.

Creator of the crm.buzz blog and podcast