[Dev] NeoScrypt GPU Miner - Public Beta Test
-
my amd driver crashes when using -w 24 -I 12…
radeon 7990, driver 13.something, although getting nearly 140 khs
-
Currently getting about 80 kh/s with 3.7.6b, cl file from 3.7.5, *.bin generated with drivers 13.12. And currently running on 14.9
Sapphire 280x with settings
cgminer.exe --neoscrypt -I 13 -w 32 --thread-concurrency 8192
definitely something wrong with the *.cl file from 3.7.6b, no accepted shares unless i lower intensity to 6 but then no performance.
link to my files working on 280x
https://mega.co.nz/#!YFli0LQR!7RjxmzHq1b0Vo0Afr107gNq51rIx1GXSxQOTYfw9_Z8
-
nope… the miner freezes here after a while :(
it starts off well and after five minutes this is what i get
or
what should i do?
-
I’ve been having the same problems with freezing. I use the -T option and it keeps running with -w 256 -I 11 --gpu-threads 2. We’ve been discussing options here https://github.com/vehre/neo-gpuminer/issues/3
-
thanks! Looks good for now… Lets see for how long it goes :)
148 khs, WU 10…
EDIT: been working stable for a couple of hours now! I am not entirely sure but I think the p2pool…de is paying extra pxc for finding a block :)
-
-T -w 48 -I 13 works best for me so far… i wish I knew that -T option months ago :D
-
as i see this is only change between them???:
- (get_global_id(0)% CONCURRENT_THREADS)];
- (get_global_id(0)% WORKGROUPSIZE)];
Correct, this is the only direct change in the kernel. Maybe I have missed something in the miner that correlates with this.
Advice: when you use a config file for running the miner, please remove the “thread-concurrency” : N entry from the config file if N is smaller than the worksize!
-
(opt_neoscrypt|| opt_scrypt)? 84: sizeof(work->data), true))) {
have no idea how it compares to setting of my miner but i get most stable and fast hash with value -w 84
That line of code certainly has nothing to do with the worksize. The line you have copied there takes care of correct communication when solo mining.
-
So just to wrap it up:
-w should be the preferred worksize of the GPU. This is usually a value evenly divisible by 64, and I haven’t yet found a GPU where this value is below 256. That is why cgminer chooses 256 as the default for worksize.
The HW errors occurring with 3.7.6x are most likely due to me failing to ensure that the thread-concurrency had a value equal to or greater than the worksize. I am currently working on a version where I get rid of the thread-concurrency completely and make use of the worksize only. Thread concurrency is of no use in neoscrypt and makes sense in scrypt only (at least to me).
My current setting on an Nvidia Geforce 218 is:
-w 512
The thread-concurrency is implicitly set to 512 currently. My intensity is set to default (dynamic).
I don’t use configuration files, only command-line settings. When you use configuration files, make a backup and set worksize back to 256 or 512, depending on what your GPU prefers (removing the line completely makes cgminer select the device’s preferred value). Next, make sure thread-concurrency is set to the worksize or a greater value (again, removing the line makes cgminer use a reasonable default value). Setting thread-concurrency to a value significantly greater than the worksize only wastes memory.
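For those who do use config files, the rules above could be checked with a small script before starting the miner. This is only a rough sketch, not part of cgminer: it assumes the config is plain JSON with the keys "worksize" and "thread-concurrency", that values may be strings and comma-separated per GPU (only the first entry is checked), and the 4x limit for "significantly greater" is an arbitrary number picked for illustration. The function names are made up.

```python
import json


def _first_int(value):
    # Config values are often strings and may be comma-separated per GPU;
    # this simple check only looks at the first entry.
    return int(str(value).split(",")[0])


def check_config(cfg):
    """Return a list of warnings for an already parsed config dict."""
    warnings = []
    worksize = _first_int(cfg["worksize"]) if "worksize" in cfg else None
    tc = _first_int(cfg["thread-concurrency"]) if "thread-concurrency" in cfg else None

    # The preferred worksize is usually evenly divisible by 64.
    if worksize is not None and worksize % 64 != 0:
        warnings.append(f"worksize {worksize} is not a multiple of 64")

    # The comparison only makes sense when both lines are present;
    # a removed line means the miner picks its own default.
    if worksize is not None and tc is not None:
        if tc < worksize:
            warnings.append(
                f"thread-concurrency {tc} is smaller than worksize {worksize}"
            )
        elif tc > 4 * worksize:  # arbitrary illustrative threshold
            warnings.append(
                f"thread-concurrency {tc} is much larger than worksize {worksize}"
            )
    return warnings


def load_and_check(path):
    """Read a JSON config file and return the warnings for it."""
    with open(path) as fh:
        return check_config(json.load(fh))
```

Calling load_and_check on your config file and printing the returned list should be enough; an empty list means none of the rules above were violated.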
-
Considering cgminer is dead for GPU mining on newer versions, will this be added to sgminer instead in the future? I think having a common ground like that would help everyone instead of many different forks.
-
Would also love to see NeoScrypt ported to sgminer. Sgminer is much less trouble to setup, less failure.
-
Hello Everyone,
I’m back. O0
After 4 days and nights, I finally got my neoscrypt code optimized successfully by the lovely OpenCL compiler.
The current result is: ScratchReg reduced to 224, and the overall hash rate for an R9 290 is 160-170 kh/s. :)
With 5 R9 290s, I got around 800-830 kh/s locally and 780-800 kh/s on PXC.theblocksfactory.
link: http://i58.tinypic.com/noxd9h.jpg
My rig:
5 ASUS R9 290
Win8.1
GPU: default core and memory frequency.
AppSDK: 2.9.1
Catalyst Driver: 14.4
Plus: Coding OpenCL is a real nightmare: commenting out one line or adding one useless line makes the result 100% different.
Sorry for my national holiday, but the result is exciting.
Love Neoscrypt, hate opencl code but enjoying the fun.
Ralph
-
I cannot express how grateful I am for everyone’s work here!
This is monumental. The newsletter will go out ASAP, the thread is primed…
BRING ON THE STRESS TESTING!!! sorry for the caps but this had to be yelled.
-
Looks like I’ll need to do a build on my Linux boxes
-
Would also love to see NeoScrypt ported to sgminer. Sgminer is much less trouble to setup, less failure.
The issue is not porting the cl-kernel to sgminer, but the changes necessary to cope with the changes in the network protocol.
-
Plus: Coding OpenCL is a real nightmare: commenting out one line or adding one useless line makes the result 100% different.
Sorry for my national holiday, but the result is exciting.
I totally agree, coding OpenCL is a nightmare. Unfortunately I can explain the theory of why this happens: the SIMD model is playing against us. Adding or deleting instructions that threads have to skip, or that are used to sync execution, weighs heavily on overall performance.
Let’s look at the code at the end of fastkdf():
if (a >= output_len)
// copy
else
// merge
Now “a” depends on the input data. When a bunch of threads execute this conditional on different data (remember, every thread has its own distinct data), chances are that some threads execute the then part (copy) while others execute the else part (merge). SIMD dictates that all threads execute the same instruction or skip it. In other words, all threads step through both parts of the conditional. OpenCL is able to switch off some of the threads, i.e., a thread sees the instruction but does not execute it; it idles. The compiler tries to handle this, but is not always successful.
So if just one thread needs to execute a different part of the conditional than all the other threads, all threads will nevertheless step through all instructions in both the then and the else branch.
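To make that concrete, here is a toy model in Python (not OpenCL, and not how any real GPU scheduler is implemented). It treats a group of threads as running in lockstep and counts how many instruction slots the group spends on the conditional. The function name and the cost parameters are made up for illustration; only “a” and output_len come from the kernel code above.

```python
def run_wavefront(a_values, output_len, copy_cost=1, merge_cost=1):
    """Count the instruction slots a lockstep group spends on the conditional.

    a_values holds one value of "a" per thread. copy_cost and merge_cost
    are hypothetical instruction counts for the two branches.
    """
    # Per-thread mask: True means the thread wants the copy branch.
    take_copy = [a >= output_len for a in a_values]

    slots = 0
    if any(take_copy):
        # At least one thread copies: the whole group steps through the
        # copy branch, and threads with mask False idle meanwhile.
        slots += copy_cost
    if not all(take_copy):
        # At least one thread merges: the whole group steps through the
        # merge branch too, and the copying threads idle.
        slots += merge_cost
    return slots


# 64 threads that all agree pay for one branch only.
uniform = run_wavefront([10] * 64, output_len=8)
# A single thread that disagrees makes the whole group pay for both.
diverged = run_wavefront([10] * 63 + [2], output_len=8)
```

In this model the uniform group spends one slot and the diverged group spends two, which is the effect described above: one odd thread doubles the cost for everybody.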
So far just for the background. :)
-
I have tested 3.7.7b, and it works normally.
-
The text file windows.build.txt that comes with cgminer is a very good description. Try that first, please.
-
i would like to have some quick info how to compile miner in windows…
i have mingw with all the wallet deps installed, but i have had never time to look how its done with miner
The text file windows.build.txt that comes with cgminer is a very good description. Try that first, please.
I spent about an hour the other night trying to get mingw to compile 3.7.6. The instructions in windows.build.txt will definitely get you started; however, they are missing a lot of info. Most of the errors you’ll run into simply require you to google them and run the appropriate “mingw-get install XXXXX” command. When you work through the windows build directions, pay special attention when downloading libraries and moving files into specific locations for the build. You’ll save yourself a lot of headaches if you don’t miss anything. Toward the end of the build, you’ll get an error from “make” regarding a missing jansson.h file. Take a look at the bfgminer windows build directions on github and find the “libjansson” section. It will walk you through building this manually and get rid of that error. Ignore the fact that it wants you to re-download it; these files are already located in the “compat” directory inside the cgminer build directory.
Cheers.
-
folks, can we get a simple program that is kinda like a “bin creator” where you have a GUI that is more friendly for configuring the gpu miner to make noobs life a bit easier and get them hooked faster?