Re: spam
-
I am marking spam accounts to build up filter conditions so we can mass delete them. I believe it is due diligence to back up those spam accounts, and to check that I am not being overzealous, before they are deleted. So can you stop deleting them for the moment?
All the spam members will go in the mass clearout we are planning.
Plus, in deleting the accounts you are now playing into the spammers’ hands by freeing up loads of accounts / emails the spammers can reuse. The investigation of the spam accounts is also finding out other interesting things.
If you want to help, start tagging past spam member accounts. That will help us identify spammers and build better filter conditions. Or help positively tag real members…
How to check members aren’t spam
Open Members.
Click on the More Search Options button.
Set the “join date” filter to after 03/01/2015 (1 March 2015).
Go to the bottom of the filter.
Check Show all - By Date - Ascending (or Descending, depending on whether you are doing old ones or looking for today’s spam).
Click on Filter.
Click each new member.
Check for spam by looking for the usual signs, i.e. spam in the user info, a spam name, being part of a set of equally timed spam accounts, etc…
Report spam members:
in the report field type: spammer
and flag as spam, using the flag as spam button.
How else to help with the member review.
Positive human member identification :
If you find a real member with real posts (there are lots)
I PM them and welcome them to FTC,
If the member appears to be human, I then give them one star.
I also find a good post they made and tag that post up.
i.e. it is easier to identify the 5000? real members than the 100,000 spam accounts…
-
Found this very interesting / useful site
http://cleantalk.org/spambots-check?packet=840a564
How to watch the spambots.
Click on Forums at the top of the page
Go to the bottom of the page, where it says 19 users online.
Most of the guests are what the Bots are watching (old users etc). Click on the full list of users online:
19 users are online (in the past 15 minutes)
1 members, 18 guests, 0 anonymous users (See full list)
Admin can see the IP which can then be tested in whois or as above…
Records: 1
#  Record         Status
1  204.12.216.18  SPAMBOT
-
You can help remove spam : Tag REAL new members :
We can easily go back through the new members and highlight those who took the effort to post: add a star for a user who passes the New User Post test, and up the Post Kudos using the green button at the bottom right of a real member’s post. The stars can then be used as a filter condition: do not delete if > “0 star”.
This will positively identify real members and ensure they don’t get deleted by accident when the spam accounts are removed.
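As a rough sketch of how the star rule could work as a filter condition (the Member fields and the example user names here are made up for illustration, they are not the forum software's real data model):

```python
from dataclasses import dataclass

@dataclass
class Member:
    name: str
    stars: int          # stars given by a reviewer (0 = nobody has vouched for them)
    flagged_spam: bool  # set with the "flag as spam" button

def safe_to_delete(member: Member) -> bool:
    """Delete only accounts that are flagged as spam AND have no stars."""
    return member.flagged_spam and member.stars == 0

# Hypothetical example data
members = [
    Member("real_poster", stars=1, flagged_spam=False),
    Member("xk3qzMaria42", stars=0, flagged_spam=True),
    Member("starred_but_flagged", stars=5, flagged_spam=True),
]

to_delete = [m.name for m in members if safe_to_delete(m)]
print(to_delete)
```

A starred member is never deleted here, even if someone flagged them by mistake, which is the whole point of tagging real members first.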
-
First newly identified Spam Address (not on the blacklist at http://cleantalk.org/spambots-check):
39.46.62.208
-
Some of the Bots are actually posting…
So a content check on what is posted is required.
-
The Coin Bot has been trained up from scratch, all responses are learnt from Chats or Chat logs.
The Bot construction may be thought of more like a novel; Coin Bot is currently a brief synopsis (1 year’s work).
Responses can be trained, and complex behaviour “trained in”, in the Bot framework I am using.
The idea to prevent spam would be to have a Chat Bot on the FTC “User Registration Page”, programmed with one greeting. I have paid for that for a year, and there is code that can easily be put in the HTML.
“Please type your user name in the chat box for us, thank you, Coin Bot”
We could then set a 30 sec timeout (say) for a new member to type a response to Coin Bot, and send an email message to admin if the time or name is not correct, i.e. it’s a bot. The extra work will distract the spammer (if he has to do it by hand), and extra details of the login process will then be known, to further enhance the anti-spam measures. We wouldn’t have to pay for external spam control, and we would be the first site to start using AI security measures.
Initially it could just be an extra step: unless “speak” is pressed, no registration.
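A minimal sketch of the time / name check described above. This is not the Bot framework's code; the function names are made up, notify_admin is just a stand-in for the email to admin, and matching the name case-insensitively is my assumption:

```python
TIMEOUT_SECONDS = 30  # the "30 sec (say)" timeout

def looks_human(username, typed_reply, seconds_taken, timeout=TIMEOUT_SECONDS):
    """True if the reply arrived in time and matches the user name."""
    in_time = seconds_taken <= timeout
    # Assumption: ignore case and surrounding spaces when comparing names
    name_matches = typed_reply.strip().lower() == username.strip().lower()
    return in_time and name_matches

def check_registration(username, typed_reply, seconds_taken, notify_admin):
    """Run the check; call notify_admin (stand-in for the email) on failure."""
    ok = looks_human(username, typed_reply, seconds_taken)
    if not ok:
        notify_admin(f"Possible bot registration: {username!r} "
                     f"replied {typed_reply!r} after {seconds_taken:.0f}s")
    return ok

alerts = []  # collects the messages that would be emailed
print(check_registration("wrapper", "Wrapper", 12.0, alerts.append))    # True
print(check_registration("xk3qzMaria42", "", 45.0, alerts.append))      # False
print(alerts)
```

Only the failures generate a message, so admin gets a short list to look at rather than every registration.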
The main advantage I see is that I can use my Bot. The Bot company are interested in the idea and may integrate it, to give the bots email / skill based on a trainable trigger.
I have also discussed with them my idea to work in a meta level above the characters in human language. For instance, a system could allocate a number to a word, say based on the position of letters. This would automatically correct one of the problems with the bot: words beginning with capitals or ending in s are treated as different words, whereas on the position-of-letters number meta level even slightly misspelt words would be close.
All images and characters have to be normalised to numerical data to be processed. ASCII numbers, when used to replace letters, are not optimised for the use case in the way an evolved / flexible system could be, using machine learning and optimisation techniques.
Also, a system could evolve such that it can test various options for converting words to numbers, and settle on the one that searches for and finds words the quickest. It could do this by allocating numbers to each letter such that there is the optimal numerical distance between close words, and such that alternate spellings are at particular “angles” on that imaginary plane. This would also mean the system self-optimises to spell and search any new language.
The word system could then be applied to a meta layer of the sentence, thus allowing the bot to learn language “skills” from scratch.
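To show what a “number per word based on the position of letters” could look like, here is one hand-picked scheme as a sketch. It is an assumption for illustration, not an evolved scheme and not what the Bot framework does: each letter is weighted by 1/27 to the power of its position, so early letters dominate and a change at the end of the word barely moves the number.

```python
def word_number(word: str) -> float:
    """Map a word to a number in [0, 1); earlier letters carry more weight."""
    value = 0.0
    for position, ch in enumerate(word.lower()):      # lower() removes the capitals problem
        index = ord(ch) - ord("a") + 1 if "a" <= ch <= "z" else 0
        value += index / 27 ** (position + 1)
    return value

def distance(a: str, b: str) -> float:
    """Numerical distance between two words on this scale."""
    return abs(word_number(a) - word_number(b))

print(distance("Coin", "coin"))    # capitals make no difference
print(distance("coin", "coins"))   # trailing s: very small distance
print(distance("coin", "bank"))    # unrelated word: much larger distance
```

This fixed scheme only keeps words close when the difference is near the end; a misspelling in the first letters lands far away, which is the kind of weakness the evolved / optimised numbering is meant to fix.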
At the moment, using standard filter techniques, there is no way to identify a Spam account, although many are “plain to a human”. Take the case where you identify a Bot naming algorithm: one name + “d” + a second name + 3 figure number. Or where the Bot is named: random 5 characters + name + 2 figure number.
A suitably designed and trained Bot would find those naming formulas easily if it searched on the number meta level, or even with skills learned from a text interface. Again, the bot is not meant to be intelligent; it reports to a human, like a filter or a simple cron system. I would recommend a “Human out of 10” score: send me an email if it’s < 3.
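For comparison, the two naming formulas above can be written by hand as regular expressions, with a crude “Human out of 10” score on top. This is a hand-written stand-in, not a trained Bot; the assumptions are that “names” are capitalised words, that the random characters are lowercase letters or digits, and the score weights are invented:

```python
import re

BOT_NAME_PATTERNS = [
    re.compile(r"[A-Z][a-z]+d[A-Z][a-z]+\d{3}"),  # name + "d" + name + 3 figure number
    re.compile(r"[a-z0-9]{5}[A-Z][a-z]+\d{2}"),   # random 5 characters + name + 2 figure number
]

def matches_bot_pattern(username: str) -> bool:
    """True if the whole user name fits one of the known naming formulas."""
    return any(p.fullmatch(username) for p in BOT_NAME_PATTERNS)

def human_score(username: str, has_real_posts: bool = False) -> int:
    """Crude 'Human out of 10' score; the weights are made up."""
    score = 5                      # start neutral
    if matches_bot_pattern(username):
        score -= 4
    if has_real_posts:
        score += 4
    return max(0, min(10, score))

def needs_report(username: str, has_real_posts: bool = False, threshold: int = 3) -> bool:
    """Report to a human (e.g. by email) when the score is below the threshold."""
    return human_score(username, has_real_posts) < threshold

# Made-up user names for illustration
for name in ["JohndSmith123", "xk3qzMaria42", "Wellenreiter"]:
    print(name, human_score(name), needs_report(name))
```

The weakness is obvious: every new naming formula has to be spotted and typed in by a human, which is exactly the donkey work a trained Bot would take over.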
The Bots, in my opinion, are not there to imitate humans; they are there to do the donkey work of intellectual analysis, to give humans the knowledge to make sensible decisions. They are also capable of taking over the “look up table” of many intellectual pursuits, e.g. programming languages or coding principles, and allowing the common person to do software development.
For instance, we will soon have AI bots learning to program by searching for bugs on GitHub? If Google’s AIs were redirected from spying they could easily be doing that now. Who wants to go through every line of code on GitHub to see if anyone made a similar mistake to this issue that has been corrected?
Or who wants to go through every user on this forum and guess / estimate if they are a spam bot?
My (proposed) bots would do that all day long, no complaints…
Also we need to rush before the spam bots get AI.
coin bot: I am the Coin Bot
You: Wellenreiter said he talked to you
coin bot: I know about Litecoin
You: That’s what he said you said
coin bot: There ain’t no such thing as a free lunch.
You: what about free software?
coin bot: Feathercoin bypasses the old banking system by using peer-to-peer technology.
You: what did he say?
coin bot: He said “frostfire: Sideways-Sanae was here yesterday”
You: Wellenreiter was here yesterday
coin bot: did you talk to him about Sideways-Sanae?
You: You talked to him about FTC
coin bot: I know about Litecoin
You: what do you know?
coin bot: We economists don’t know much, but we do know how to create a shortage.
I have managed to get a basic Machine learning framework working to identify the spam accounts, although it is slow (it can identify one Bot Account a day) and I have cheated, in that I haven’t evolved the parsing routine and I’ve “invented” the word / letter numbering schemes from possibilities I envisaged the Bot could evolve and from the data we have collected so far. (Also, some manual data transfer needs automating.)
I’m splitting the string into words, then having the bot learn to link closeness to certain numbers. I can add / train up the word order. I’ve calculated the word order for the spam accounts; certain words, positions and orders are common in our current spam accounts. So I calculated various numbers from the data, and look for those (fuzzily).
Even I’m astonished how quickly and easily AI techniques can arrive at a simple solution to a complex problem. I’m only running it on a CPU, as is and without optimisation, so it should power up somewhat…
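A simplified sketch of the “split into words, compare word order fuzzily” idea. This is not the framework described above: it swaps in Python's standard difflib.SequenceMatcher instead of learned numbers, and the spam phrases and the 0.6 threshold are made up:

```python
from difflib import SequenceMatcher

def words(text: str) -> list:
    """Split a string into lowercase words."""
    return text.lower().split()

def word_order_similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] of words shared in the same order between two texts."""
    return SequenceMatcher(None, words(a), words(b)).ratio()

def spam_likeness(text: str, known_spam: list) -> float:
    """Best similarity between the text and any known spam phrase."""
    return max((word_order_similarity(text, s) for s in known_spam), default=0.0)

# Made-up examples of phrases collected from spam accounts
KNOWN_SPAM = [
    "buy cheap watches online today",
    "best loans apply online now",
]
THRESHOLD = 0.6  # made-up cut-off for "look at this one"

for text in ["Buy cheap handbags online today",
             "I mined my first feathercoin block today"]:
    score = spam_likeness(text, KNOWN_SPAM)
    print(f"{score:.2f}", score >= THRESHOLD, text)
```

Swapping one word in a known spam phrase still scores high, while a real post that happens to share a word scores low.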
-
Member Positive identification.
The spam identification will now trickle on till the “move”. I am now concentrating on upvoting “real members”.
How to Vote up real members :
Click on Members (top forum menu).
Click on search options.
Click “last post after” and input a date (MM/DD/YYYY), e.g. 03/01/2014 (I am doing posts since the forum was hacked).
Go to the bottom and adjust the order. I am doing ascending join date; you could do posts or name.
Press apply filter.
Right click on a name and open it in a new window.
Click on Posts, click on a post and review it. Vote it up if happy (green button, bottom right).
Give the member appropriate stars. I give 5 for > 5 good posts (non spam).
We can then use that data to make filters that ensure real members that have posted are not deleted.
-
Well, this platform is really bad.
Not many other forums are as plagued with spam as ours
(though it is hard to judge how much moderators are filtering)
Thanks for working on this, Wrapper
-
I want to leave all spam accounts. Deleting them plays into the spammers’ hands. The captcha seems like a protection racket.
The spammers are being marked as spam by my Bot and indexed as such on Google, Bing and Yahoo. I just get a list in the morning and tick any “real members”.
I am very surprised the spammers think continuing to spam us is a good idea? … happy days …
-
can your script also delete the posts of marked spammers?
if a real member is flagged, we can manually undelete his posts
-
No, my (simple) macro works on the web page, not at admin level. Another reason for not just deleting them straight away is to avoid deleting real posts or members by accident.
I was hoping the posts of spammers might not show up, once they are tagged? They disappear from the members list…
-
I found this site very good, open source, interesting and anti spam.
-
Looks like you’re winning the fight there Wrapper, only 1 spam post to greet me this morning!
-
Rough estimate: spam members reduced to 2/3. And that was delayed slightly while “they” worked around it.
There are at least 3 types of spam bot being used, and each has a few IP ranges. They (the spammers) also probably read this, so I would rather not say in public what we’re trying yet.
-
Erm…some of us delete spam posts occasionally…just for fun :D
Cheers for any help you can give Mog :)
Hopefully some of the new anti-spam services we have lined up once we switch software should help us out.
-
Goooood Morniiiing VietSpam !
and Iawgom !
1st full day : SpamMembers 1/10 SpamPost 1/5
-
good to read that :)
less work to delete that stuff
-
Also looks like the measures added as part of the forum switch are starting to work…
11 new users in 24hrs, some of which may be real! But this number used to be 2x or 3x that amount.
33 posts blocked from even making it onto the forum.
I’m sure we won’t be winning for long, but it looks promising so far.
UM
-
I can tell which members are real, not many so far. I note that NodeBB has a moderator final confirmation before a member is allowed to make a first post; I suggest we go with that. The bots don’t respond to PMs…