*** John Goerzen [2020-12-30 09:37]:
>Interesting! I just happened across an apparent memory leak in nncp-exec in
>which it was using over 1GB of RAM. To be sure, I may have been piping
>several GB of data to it. But I would expect it to stream out the data as it
>is read from stdin, hopefully? I wonder if this is related?

My intention for uncompressed execs is not related. I just see that
someone can (obviously) use nncp-exec for tasks other than sending
(highly compressible) email/news, and in many such cases compression is
useless or even harmful.

nncp-exec does indeed store all compressed data in memory before
writing it to disk. I can replace that with writing to a temporary
file, as "nncp-file - dst:..." already does now.

The problem with streaming is that you do not know the exact file size
in advance, and the size is written at the beginning of the first
encrypted block. So currently, for nncp-file reading from stdin, I
store all read data in a temporary file and then read it back, knowing
its size. Of course ephemeral symmetric encryption is applied, so even
if the computer suddenly shuts down and that file stays there, no one
can decrypt it.

I am not sure (it is too late to think clearly :-)), but I think the
temporary file can be replaced with the following scheme:

* we create an ordinary encrypted packet in place, streaming all the
  data into it, but leaving its first encrypted 128 KiB block blank
  (filled with zeros); its payload is kept in RAM
* after all the data is read and written, we fill the size field in
  that in-RAM block and encrypt it
* then we seek back to the zero-filled first block and overwrite it
  with the fully ready encrypted contents from RAM

All encrypted blocks are independent of each other
(http://www.nncpgo.org/Encrypted.html), so we can safely do that. A
zero-filled block will also skip real block allocation on ZFS with any
kind of compression enabled. But that single 128 KiB block won't be
placed sequentially with the other ones. Not a big deal, I think.

But here is another problem: the whole packet is hashed from beginning
to end, and changing that single block will require the entire hash
calculation to be done again. It is good that we get rid of the
temporary file altogether, but we would still need to sequentially
write all the data from stdin, do the seek/overwrite, and then
sequentially read all the data again for its checksumming.

We can avoid that by using hash trees, Merkle trees. Changing that
single block then leads to recalculation of only one tree path, without
reading the data from disk again. Moreover, it gives the ability to
parallelize hash calculations (though I doubt hashing is a bottleneck
anywhere in practice, but who knows!).

So... currently I do not see what can go wrong with that scheme. No
temporary files (for "nncp-file -" and nncp-exec) and parallelizable
hashing. And of course it will "fix" (actually it is not a bug, it is
just how it has worked all along :-)) the memory consumption issue of
nncp-exec. But the new scheme will not be backward compatible with
current packets, because of the different checksum algorithm.

>On the receiving side, I'm getting "data exceeds max slice limit" for these
>large ones.
>Is there a file size limit to exec and (un-chunked) file?

There should not be any. Currently I do not know where that error can
be raised. I will look at it later, probably when the "new scheme" is
implemented.

>Thanks again for NNCP. I am really appreciating it!

Glad to hear that!
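
Here is a minimal sketch of the seek-and-overwrite idea in Go, just to
illustrate the mechanics. It is not NNCP's real code: the "packet.bin"
name, the plain uint64 size field, and the flat SHA-256 at the end are
placeholders standing in for the actual packet format and encryption.

package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
	"io"
	"os"
)

const blockSize = 128 * 1024 // the 128 KiB block size mentioned above

func main() {
	out, err := os.Create("packet.bin") // hypothetical output path
	if err != nil {
		panic(err)
	}
	defer out.Close()

	// 1. Reserve the first block with zeros; on ZFS with compression
	//    enabled such a block should not be really allocated.
	if _, err := out.Write(make([]byte, blockSize)); err != nil {
		panic(err)
	}

	// 2. Stream the payload from stdin, learning its size as we go.
	size, err := io.Copy(out, os.Stdin)
	if err != nil {
		panic(err)
	}

	// 3. Build the real first block in RAM: here just a size field plus
	//    zero padding; in NNCP this is where the block would be encrypted.
	first := make([]byte, blockSize)
	binary.BigEndian.PutUint64(first[:8], uint64(size))

	// 4. Seek back and overwrite the zero-filled placeholder.
	if _, err := out.WriteAt(first, 0); err != nil {
		panic(err)
	}

	// The drawback described above: a flat hash over the whole packet now
	// has to be recomputed, re-reading everything from disk.
	if _, err := out.Seek(0, io.SeekStart); err != nil {
		panic(err)
	}
	h := sha256.New()
	if _, err := io.Copy(h, out); err != nil {
		panic(err)
	}
	fmt.Printf("payload size: %d, flat hash: %x\n", size, h.Sum(nil))
}

That final flat hash is exactly the cost the Merkle tree would remove:
it forces a second sequential read of the whole file after the header
block is patched.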
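
And a similarly hedged sketch of the Merkle-tree point (again plain
SHA-256, not NNCP's actual hashing): if the per-block leaf hashes are
kept while streaming, patching the first block only requires rehashing
that one block and recombining already-known hashes, never re-reading
the rest of the data from disk.

package main

import (
	"crypto/sha256"
	"fmt"
)

// leafHash hashes one data block.
func leafHash(block []byte) [32]byte { return sha256.Sum256(block) }

// nodeHash combines two child hashes into a parent hash.
func nodeHash(l, r [32]byte) [32]byte {
	return sha256.Sum256(append(l[:], r[:]...))
}

// root rebuilds the tree root from the cached leaf hashes. For brevity it
// recombines every level; a real implementation would also cache the
// interior nodes and redo only the log(n) nodes on the changed path.
func root(leaves [][32]byte) [32]byte {
	level := leaves
	for len(level) > 1 {
		var next [][32]byte
		for i := 0; i < len(level); i += 2 {
			j := i + 1
			if j == len(level) {
				j = i // duplicate the last node on odd-sized levels
			}
			next = append(next, nodeHash(level[i], level[j]))
		}
		level = next
	}
	return level[0]
}

func main() {
	// Pretend these are the packet's 128 KiB blocks; their leaf hashes are
	// computed once, while the data is being streamed out.
	blocks := [][]byte{
		[]byte("zero-filled placeholder"),
		[]byte("block 1"), []byte("block 2"), []byte("block 3"),
	}
	leaves := make([][32]byte, len(blocks))
	for i, b := range blocks {
		leaves[i] = leafHash(b)
	}
	fmt.Printf("root before patch: %x\n", root(leaves))

	// Overwrite the first block with the finalized, size-carrying one:
	// only leaves[0] changes, the other leaf hashes are reused as-is and
	// the data blocks never have to be read from disk again.
	blocks[0] = []byte("finalized first block with real size")
	leaves[0] = leafHash(blocks[0])
	fmt.Printf("root after patch:  %x\n", root(leaves))
}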
-- Sergey Matveev (http://www.stargrave.org/) OpenPGP: CF60 E89A 5923 1E76 E263 6422 AE1A 8109 E498 57EF