Government
What Words and Phrases in Your Emails Might the FBI's New Software Be Searching For?
2013-01-11
Dear FBI - My wife has requested that you send Special Agent Paul Krendler to interview me......
According to IT Pro, phrases like "nobody will find out," "cover up" and "off the books" are among the more than 3,000 words and phrases in new software developed by the accounting firm Ernst & Young with the FBI. In addition to these more obvious phrases, IT Pro reported that the software can identify changes in tone, can be targeted to specific employees and also picks up on indications of nervousness, like "call my mobile."

Do you haff ze plans?
Wound my heart with a monotonous languor
Fred has a long moustache
Posted by:Uncle Phester
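
The phrase-list idea in the article can be illustrated with a minimal sketch. It assumes nothing about the actual Ernst & Young / FBI software: the watch list below holds only the four phrases quoted above, and `WATCH_PHRASES` and `find_phrases` are hypothetical names chosen for illustration.

```python
import re

# Hypothetical watch list: only the phrases quoted in the article.
WATCH_PHRASES = ["nobody will find out", "cover up", "off the books", "call my mobile"]

def find_phrases(text, phrases=WATCH_PHRASES):
    """Return the watch-list phrases found in text, ignoring case and extra whitespace."""
    # Lowercase and collapse runs of whitespace so "OFF  the books" still matches.
    normalized = re.sub(r"\s+", " ", text.lower())
    hits = []
    for phrase in phrases:
        # Word boundaries keep "cover up" from matching inside "discover upper".
        if re.search(r"\b" + re.escape(phrase) + r"\b", normalized):
            hits.append(phrase)
    return hits

print(find_phrases("Keep it OFF  the books, nobody will find out."))
```

Plain phrase lookup like this says nothing about tone or nervousness; those capabilities, as reported, would need more than string matching.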

#19  Neither of those presents insuperable technical problems, airandee.
Posted by: lotp   2013-01-11 21:56  

#18  If you type text in a spreadsheet and then cut and paste the text as a 'picture' into your email, will the software be able to read it?

Note: there is a function called spedis that matches like-sounding terms, so intentional misspellings won't matter.
Posted by: Airandee   2013-01-11 18:50  
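
The point about misspellings can be sketched as follows. This is not spedis; it swaps in `difflib.get_close_matches` from Python's standard library as a stand-in similarity measure. The word list, the `fuzzy_hits` name and the 0.8 cutoff are assumptions made for illustration.

```python
import difflib

# Hypothetical watch words, chosen only for this example.
WATCH_WORDS = ["books", "cover", "mobile"]

def fuzzy_hits(text, watch_words=WATCH_WORDS, cutoff=0.8):
    """Map each token in text to the watch word it closely resembles, if any."""
    hits = {}
    for token in text.lower().split():
        token = token.strip(".,!?\"'")  # drop surrounding punctuation
        # get_close_matches returns up to n candidates whose similarity ratio >= cutoff.
        match = difflib.get_close_matches(token, watch_words, n=1, cutoff=cutoff)
        if match:
            hits[token] = match[0]
    return hits

print(fuzzy_hits("keep it off the boooks"))
```

Text pasted as a picture is a separate problem: nothing here reads images, which would need an OCR step first.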

#17  This is an area in which I have some expertise. To answer the questions above:

There are many ways (and many off-the-shelf software packages, and many custom systems) that handle natural language data, including in text form (documents, web pages, tweets, email etc. as opposed to transcribed conversations, which tend to have a different linguistic structure). The state of the art goes well beyond just finding specific words or phrases, but the capabilities of specific systems outside of R&D shops differ greatly. IARPA is already several years into an R&D program on cross-language, cross-culture metaphor identification and interpretation, for instance. NIST has been running text retrieval, topic modeling, content extraction, machine translation etc. challenges / competitions for almost 20 years now. DARPA has a Deep Exploration and Filtering of Text effort that is expected to transition to operational use within DOD within a few years.

l33t sp34k would be fairly easy to deal with. Deleted text is just the regular text surrounded by formatting markers, so no problem there. Tweets / text conventions are already addressed here and there. Many packages handle various languages, including Arabic. Palantir primarily displays and links rather than interpreting the text itself - document information is imported into the tool, but tagging is done manually by analysts before the tool can display and cross-correlate based on those tags.
Posted by: lotp   2013-01-11 18:37  
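
As a rough illustration of the two points above about l33t sp34k and struck-through text, here is a minimal normalizer. It assumes the strikethrough arrives as HTML-style `<s>`, `<strike>` or `<del>` tags and uses a small, hypothetical digit-to-letter table; real systems would presumably handle many more formats and substitutions.

```python
import re

# Hypothetical substitution table; "1" is mapped to "l", though it can also stand for "i".
LEET_MAP = str.maketrans({"0": "o", "1": "l", "3": "e", "4": "a", "5": "s", "7": "t"})

# Assumed formatting markers for struck-through text.
STRIKE_TAGS = re.compile(r"</?(?:s|strike|del)\s*>", re.IGNORECASE)

def deleet(token):
    """Translate digits only in tokens that mix letters and digits, so dates stay intact."""
    if any(c.isalpha() for c in token) and any(c.isdigit() for c in token):
        return token.translate(LEET_MAP)
    return token

def normalize(text):
    """Strip strikethrough markers, lowercase, and undo simple l33t substitutions."""
    visible = STRIKE_TAGS.sub("", text)
    return " ".join(deleet(tok) for tok in visible.lower().split())

print(normalize("1 w0nd3r h0w th3y h4nd13 <s>l33t</s> sp34k"))
```

The standalone "1" is left alone because it contains no letters; deciding that it means "I" would take context, which this sketch does not attempt.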

#16  They can't find who is supporting al Qaeda, so who cares?
Posted by: Maggie Flomong2662   2013-01-11 18:17  

#15  Does anyone have eyes on all that uranium in Niger and Argentina, or is it better for the poor people to have mining jobs, kinda like farmers growing poppy in Afghanistan?
Posted by: Maggie Flomong2662   2013-01-11 18:16  

#14  How does the software handle strikethroughs?
Posted by: Barbara   2013-01-11 18:05  

#13  Does the software understand Bavarian, too?
Posted by: European Conservative   2013-01-11 17:23  

#12  Mucky is the ultimate encrypter. His messages will give the software migraines.
Posted by: Alaska Paul   2013-01-11 16:56  

#11  In the Chiefs secondary nobody will cover up, and that the payroll vs talent return is off the books horrible. I just hope nobody will find out, especially thoz Ray-dahs.

That the Chiefs are going to be horrible for two more years is supposed to be a secret, jeesh!

What about Hot Carl, or is that Secret Service territory?
Posted by: swksvolFF   2013-01-11 14:35  

#10  "Allahu Akbar" will not be tagged.

"Deus Vult" will be tagged.
Posted by: charger   2013-01-11 12:44  

#9  Do they search messages written in Arabic?

Of course not, silly - that would be RACIST.

Besides we all know the real terrorists are you 'white crackers' and returning veterans - and the Tea Party of course.

/SARC

As I recall the old 'emacs' text editor used to have a 'spook' command which would insert one of the terms the CIA was supposed to be monitoring emails for...
Posted by: CrazyFool   2013-01-11 11:39  

#8  1 w0nd3r h0w th3y h4nd13 l33t sp34k?
Posted by: Bright Pebbles   2013-01-11 10:44  

#7  Do they search messages written in Arabic?
Posted by: Rambler in Virginia   2013-01-11 09:23  

#6  Another 'popular' analysis methodology that may interest curious Rantburgers is Linguistic Inquiry and Word Count.
Posted by: Skidmark   2013-01-11 05:35  

#5  It's the phrases it won't be searching for, but should (and won't be added because of PC) that concern me the most.
Posted by: Bright Pebbles   2013-01-11 05:03  

#4  Palantir Technologies

(link fell off on #3, spelling was even worse)
Posted by: Besoeker   2013-01-11 02:23  

#3  It's called "tagging". Palantir Technolgies. The FBI is a client user.

Not only can it identify specific words or groups of words in massive volumes of data; by hitting the "merge" command, you can also link everyone else who uses the term with a line and graphically depict the entire network of users.
Posted by: Besoeker   2013-01-11 02:21  
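
The "link everyone who uses the term" idea can be sketched as a small graph-building step. This is not Palantir's API or method; `link_users_by_term`, the message format and the sample data are all hypothetical, and the output is just a set of edges that a plotting tool could draw as lines.

```python
from collections import defaultdict
from itertools import combinations

def link_users_by_term(messages, terms):
    """Return edges (user_a, user_b, term) joining every pair of users who used the same term."""
    users_by_term = defaultdict(set)
    for user, text in messages:
        lowered = text.lower()
        for term in terms:
            if term in lowered:
                users_by_term[term].add(user)
    edges = set()
    for term, users in users_by_term.items():
        # Sort so each pair appears once, in a stable order.
        for a, b in combinations(sorted(users), 2):
            edges.add((a, b, term))
    return edges

# Made-up sample data for illustration only.
messages = [
    ("alice", "Keep it off the books"),
    ("bob", "Off the books, please"),
    ("carol", "hello"),
]
print(link_users_by_term(messages, ["off the books"]))
```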

#2  I suspect that:

"A well regulated Militia, being necessary to the security of a free State, the right of the people to keep and bear Arms, shall not be infringed."

will probably flag a bit of attention as well.
Posted by: abu do you love   2013-01-11 01:42  

#1  Ummm, Bet "Fuck You FBI" is high on the list.
So, now the FBI will waste time on me?
Posted by: Redneck Jim   2013-01-11 00:46  
