[Dev] NeoScrypt GPU Miner - Public Beta Test
-
For those with old Radeon cards. This is my current OpenCL kernel: neoscrypt_vliw.cl
It is optimised to some extent for VLIW4/VLIW5. I get 17.5KH/s with it on a HD6970. That’s not much, but still better than 6KH/s with the default kernel.
How do I use it? Which miner?
-
Congrats - not a bad chacha for 6xxx. Your salsa though… needs work.
-
Okay, here you go (NSFW): https://ottrbutt.com/miner/neoscryptwolf-11042014.png
As you can see, every card is around mid-70C or below, except that 270X with the dead fan. What you don’t see is that all of them have a decent sized voltage bump, except that 270X, which is slightly undervolted to compensate for its lack of cooling. So, even at this hashrate, the cards are running quite cool - too cool, as my room is rather warm (probably from all three desktops with high end GPUs…) - so I need to trade off some of the memory usage for extra computations. Compute power is likely still plentiful, unused because of memory access times.
Here’s my 7950 doing 150KH/s. Fairly Safe For Work. (In Soviet Russia, you maul bear! Then if you’re lucky, she mauls you back.) I could use a suggestion for a good overclocking BIOS, though. I’ve been playing around with boosting a few cards to 1.3V, and I ran across The Stilt’s mods the other day, but haven’t had much luck with them either. As you can see, temperatures are absolutely no problem: it stays under 70 even when scrypt-hashing at full load, full crank. Too many variations! I can’t spend days freezing and rebooting. It’s a Gigabyte, part 113-HD685ZNF63.SB, Hynix mem of course. My best so far is around 1260/1740. Would love to hit 1300/1800.
-
Gentlemen, is 3.7.7c the latest version? Unfortunately I don’t have Linux, so I can’t compile, and I probably don’t have the knowledge to do so anyway :(. Is there somewhere I can find the latest version compiled for Windows?
-
I assume this is the newest version: http://cryptomining-blog.com/3715-new-cgminer-3-7-8-with-improved-neoscrypt-performance/
Unfortunately it took a step backwards for NVIDIA cards (slower), but it’s meant for AMD, so if it is better there, then good!
-
The download link that this blog provides is hosted by the blog itself, with no signatures etc.
Not 100% sure I would trust that download link.
Wolf0 has a nice Windows build right here, signed and all.
Okay, done. I’m pretty sure it works, but I haven’t tested it on a Windows installation. This is a zip of my kernel, slightly modified for SGMiner, as well as all the other kernels included on the GitHub develop branch, and a Win64 binary. Static compile, no DLLs, just like my standard SGMiner builds on Litecointalk. Also GPG signed, like my standard builds. Someone please test it for me and make sure it works.
https://ottrbutt.com/sgminer/neoscrypt/sgminer5-neoscrypt-11-02-2014.zip
And of course, GPG sigs for those that check them (you should be): https://ottrbutt.com/sgminer/neoscrypt/sgminer5-neoscrypt-11-02-2014.zip.sig
-
Congrats - not a bad chacha for 6xxx. Your salsa though… needs work.
It’s scalar now. I’m not impressed by the one found in old Scrypt kernels. I can probably write a better one. It isn’t a bottleneck for these cards anyway. When I replaced ChaCha with this one, it went from something like 12KH/s to 14KH/s. Loop unrolling alone delivered more.
BTW, it runs reasonably well with old AMD drivers and OpenCL compilers. An HD5870 on Windows XP with 12.4 drivers went from 2.5KH/s to 10KH/s. Hell yeah, a 4x increase. Don’t try to use this kernel on NVIDIA. It fails to compile the vectorised ChaCha code.
How do I use it? Which miner?
Rename it and put it into any miner with NeoScrypt support. Tested on cgminer v3.7.7, works fine.
-
Working great here with a couple of 6950s unlocked to 6970. Went from 5KH/s to 16.5KH/s using only -I 12 -w 64 -g 2. It also works with 3.7.7c and 3.7.8. All I did was back up neoscrypt140909.cl, delete it, and rename your file to neoscrypt140909.cl to replace it. Then I backed up and deleted all the .bin files, let the miner build new .bins, and bam, I was off to the races. :)
Thank you!
Now I don’t mind running these cards; before, I wouldn’t even use them to mine with.
Catalyst version 13.12
{
  "pools" : [
    {
      "url" : "http://us.mine-ftc.co.uk:19327",
      "user" : "xxxxxxxxxxxxy",
      "pass" : "x"
    }
  ],
  "intensity" : "12,12",
  "vectors" : "1,1",
  "worksize" : "64,64",
  "gpu-engine" : "825-825,825-825",
  "gpu-fan" : "0-95,0-80",
  "gpu-memclock" : "1300,1300",
  "gpu-memdiff" : "0,0",
  "gpu-powertune" : "0,0",
  "gpu-vddc" : "0.000,0.000",
  "temp-cutoff" : "90,90",
  "temp-overheat" : "85,85",
  "temp-target" : "70,70",
  "api-mcast-port" : "4028",
  "api-port" : "4028",
  "expiry" : "1",
  "failover-only" : true,
  "gpu-dyninterval" : "7",
  "gpu-platform" : "0",
  "gpu-threads" : "2",
  "log" : "5",
  "neoscrypt" : true,
  "no-pool-disable" : true,
  "no-submit-stale" : true,
  "queue" : "0",
  "scan-time" : "1",
  "temp-hysteresis" : "3",
  "shares" : "0",
  "kernel-path" : "/usr/local/bin",
  "device" : "0-1"
}
-
Am I right that I cannot do anything with my GeForce 7600GS?
-
I think there is an error…
/* NeoScrypt core engine:
 * N = 128, r = 2, p = 1, salt = password */
__attribute__((reqd_work_group_size(WORKGROUPSIZE, 1, 1)))
Should it be this instead?
/* NeoScrypt core engine:
 * N = 128, r = 2, p = 1, salt = password */
__attribute__((reqd_work_group_size(WORKSIZE, 1, 1)))
-
I think there is an error…
/* NeoScrypt core engine:
 * N = 128, r = 2, p = 1, salt = password */
__attribute__((reqd_work_group_size(WORKGROUPSIZE, 1, 1)))
Should it be this instead?
/* NeoScrypt core engine:
 * N = 128, r = 2, p = 1, salt = password */
__attribute__((reqd_work_group_size(WORKSIZE, 1, 1)))
There is no error, but it doesn’t really matter.
[2014-11-07 18:47:02] Started cgminer 3.7.8
[2014-11-07 18:47:07] Probing for an alive pool
[2014-11-07 18:47:08] Error -11: Building Program (clBuildProgram)
[2014-11-07 18:47:08] "/tmp/OCLxlhCZF.cl", line 665: error: identifier "WORKSIZE" is undefined
__attribute__((reqd_work_group_size(WORKSIZE, 1, 1)))
                                    ^
1 error detected in the compilation of "/tmp/OCLxlhCZF.cl".
Internal error: clc comp
-
I think there is an error…
/* NeoScrypt core engine:
 * N = 128, r = 2, p = 1, salt = password */
__attribute__((reqd_work_group_size(WORKGROUPSIZE, 1, 1)))
Should it be this instead?
/* NeoScrypt core engine:
 * N = 128, r = 2, p = 1, salt = password */
__attribute__((reqd_work_group_size(WORKSIZE, 1, 1)))
WORKSIZE is only for the newer SGMiner.
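If one kernel file has to build under both miners, a small preprocessor shim is one way to handle it. This is only a sketch under assumptions: that the host passes either WORKSIZE or WORKGROUPSIZE as a -D build option (which is what the error log and the reply above suggest), and that falling back to 64 is acceptable when neither is set. NS_WORKSIZE, search_sketch and ns_worksize are made-up names. The kernel part is guarded so the same text also compiles as plain C for a quick check.

```c
/* Assumption: the miner defines WORKSIZE or WORKGROUPSIZE with -D when it
 * builds the kernel. NS_WORKSIZE is a hypothetical name for this sketch. */
#if defined(WORKSIZE)
#define NS_WORKSIZE WORKSIZE
#elif defined(WORKGROUPSIZE)
#define NS_WORKSIZE WORKGROUPSIZE
#else
#define NS_WORKSIZE 64 /* hypothetical fallback, not a tuned value */
#endif

#ifdef __OPENCL_VERSION__
/* Seen only by the OpenCL compiler; kernel name and body are placeholders. */
__attribute__((reqd_work_group_size(NS_WORKSIZE, 1, 1)))
__kernel void search_sketch(__global uint *out)
{
    out[get_global_id(0)] = 0;
}
#else
/* Host-side helper so the macro selection can be checked as ordinary C. */
static unsigned ns_worksize(void)
{
    return (unsigned)NS_WORKSIZE;
}
#endif
```

With this in place the attribute line no longer depends on which of the two names a given miner happens to define.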
-
That’s what I used…
-
It’s scalar now. I’m not impressed by the one found in old Scrypt kernels. I can probably write a better one. It isn’t a bottleneck for these cards anyway. When I replaced ChaCha with this one, it went from something like 12KH/s to 14KH/s. Loop unrolling alone delivered more.
BTW, it runs reasonably well with old AMD drivers and OpenCL compilers. An HD5870 on Windows XP with 12.4 drivers went from 2.5KH/s to 10KH/s. Hell yeah, a 4x increase. Don’t try to use this kernel on NVIDIA. It fails to compile the vectorised ChaCha code.
Rename it and put it into any miner with NeoScrypt support. Tested on cgminer v3.7.7, works fine.
I’ve never worked on 6xxx, but isn’t shuffle cheap? One shuffle for the permutation, keep it through all the salsa rounds + XOR ops, one shuffle to fix it. Seems like it’d be worth it - of course, unrolling will deliver a lot, probably.
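One way to read the shuffle idea, sketched in plain C with arrays standing in for OpenCL uint4 vectors: permute the 16 state words once so the four Salsa diagonals sit in four "vectors" A, B, C, D, run every round (and any XOR or add between blocks) in that layout, and undo the permutation only at the end. The function names and the PERM table are assumptions of this sketch, not code from any kernel in the thread, and the lane rotations in the row round would be swizzles in a vector version.

```c
#include <stdint.h>

static inline uint32_t rotl32(uint32_t v, int n) { return (v << n) | (v >> (32 - n)); }

/* Salsa20 quarter-round on four words. */
#define QR(a, b, c, d) do {   \
    b ^= rotl32(a + d,  7);   \
    c ^= rotl32(b + a,  9);   \
    d ^= rotl32(c + b, 13);   \
    a ^= rotl32(d + c, 18);   \
} while (0)

/* Reference core on the natural word order, with the feed-forward add. */
static void salsa_ref(uint32_t out[16], const uint32_t in[16], int rounds)
{
    uint32_t x[16];
    for (int i = 0; i < 16; i++) x[i] = in[i];
    for (int r = 0; r < rounds; r += 2) {
        QR(x[0], x[4], x[8], x[12]);   QR(x[5], x[9], x[13], x[1]);    /* columns */
        QR(x[10], x[14], x[2], x[6]);  QR(x[15], x[3], x[7], x[11]);
        QR(x[0], x[1], x[2], x[3]);    QR(x[5], x[6], x[7], x[4]);     /* rows */
        QR(x[10], x[11], x[8], x[9]);  QR(x[15], x[12], x[13], x[14]);
    }
    for (int i = 0; i < 16; i++) out[i] = x[i] + in[i];
}

/* Word order that gathers the diagonals: A = y[0..3], B, C, D follow. */
static const int PERM[16] = { 0, 5, 10, 15,  4, 9, 14, 3,  8, 13, 2, 7,  12, 1, 6, 11 };

static void permute(uint32_t y[16], const uint32_t x[16])
{
    for (int i = 0; i < 16; i++) y[i] = x[PERM[i]];
}

static void unpermute(uint32_t x[16], const uint32_t y[16])
{
    for (int i = 0; i < 16; i++) x[PERM[i]] = y[i];
}

/* Same core, but input and output both stay in the permuted layout. */
static void salsa_perm(uint32_t out[16], const uint32_t in[16], int rounds)
{
    uint32_t y[16];
    uint32_t *A = y, *B = y + 4, *C = y + 8, *D = y + 12;
    for (int i = 0; i < 16; i++) y[i] = in[i];
    for (int r = 0; r < rounds; r += 2) {
        for (int l = 0; l < 4; l++)   /* column round: lanes already line up */
            QR(A[l], B[l], C[l], D[l]);
        for (int l = 0; l < 4; l++)   /* row round: B, C, D lanes rotated */
            QR(A[l], D[(l + 1) & 3], C[(l + 2) & 3], B[(l + 3) & 3]);
    }
    for (int i = 0; i < 16; i++) out[i] = y[i] + in[i];  /* add is lane-wise too */
}

/* Returns 1 if permute -> salsa_perm -> unpermute matches the reference. */
static int salsa_layouts_agree(int rounds)
{
    uint32_t x[16], ref[16], px[16], pout[16], got[16];
    for (int i = 0; i < 16; i++) x[i] = 0x9e3779b9u * (uint32_t)(i + 1);
    salsa_ref(ref, x, rounds);
    permute(px, x);
    salsa_perm(pout, px, rounds);
    unpermute(got, pout);
    for (int i = 0; i < 16; i++)
        if (ref[i] != got[i]) return 0;
    return 1;
}
```

Because XOR and addition act word by word, blocks can be combined while still permuted, so the two permutations are paid once per chain of operations and not once per Salsa call.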
-
It seems my 280X and 290X do like parallel chacha - I just needed to tweak it a bit more. Code size seems about the same, though, with a small speedup in execution time, I think.
-
OH MY GOD. I’ve been staring at this code for ages, and it only JUST NOW occurred to me that SMix() is parallelizable. Not the internals of SMix, of course, but the two calls to it…
-
I really hope, Wolf, that you are not doing this just for yourself…
You would gain more if you released your work. I’m sure people here would be willing to collect a bounty for you to release your latest kernels.
These people here are fair people.
-
I’m doing this because it’s interesting. Also, SMix being parallelizable hardly matters unless you split it into 3 kernels, which is doable, but idk what the overhead on the kernel launches would be…
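A toy sketch of what that three-way split could look like, written as plain C functions standing in for kernels. Everything here is hypothetical: smix_first and smix_second are trivial stand-ins for the two SMix() calls, WORDS is not the real state size, and combining the two results with XOR is an assumption of this sketch. It only shows the data flow: split into two private copies, run the two calls in either order (or side by side), then join.

```c
#include <stdint.h>

#define WORDS 8 /* toy state size, not NeoScrypt's real block size */

/* Hypothetical stand-ins for the two SMix() calls; the real ones are far heavier. */
static void smix_first(uint32_t s[WORDS])
{
    for (int i = 0; i < WORDS; i++)
        s[i] = s[i] * 2654435761u + (uint32_t)i;
}

static void smix_second(uint32_t s[WORDS])
{
    for (int i = 0; i < WORDS; i++)
        s[i] = (s[i] ^ 0xa5a5a5a5u) + 7u * (uint32_t)i;
}

/* Stage 1 ("kernel 1"): give each SMix() call its own copy of the state. */
static void stage_split(const uint32_t in[WORDS], uint32_t x[WORDS], uint32_t z[WORDS])
{
    for (int i = 0; i < WORDS; i++)
        x[i] = z[i] = in[i];
}

/* Stage 3 ("kernel 3"): combine both results. XOR is an assumption here. */
static void stage_join(uint32_t out[WORDS], const uint32_t x[WORDS], const uint32_t z[WORDS])
{
    for (int i = 0; i < WORDS; i++)
        out[i] = x[i] ^ z[i];
}

/* Stage 2 ("kernel 2") runs the two calls; their order must not matter. */
static void run_sketch(uint32_t out[WORDS], const uint32_t in[WORDS], int second_first)
{
    uint32_t x[WORDS], z[WORDS];
    stage_split(in, x, z);
    if (second_first) {
        smix_second(z);
        smix_first(x);
    } else {
        smix_first(x);
        smix_second(z);
    }
    stage_join(out, x, z);
}
```

Since neither call reads the other's buffer, both orders give the same output. Whether the extra kernel launches cost more than the parallelism gains is exactly the open question in the post above, and this sketch says nothing about it.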
-
Quick question, in my bat file, how do I specify different values for say -i or -w so that my two cards (which are different) have different settings?
-i 14,15?
-w 48, 72?
Kind regards,
T4
-
"intensity" : "18,18,18",
"worksize" : "256,128,256",
Specify them in your .conf.