Bulk Download - IOException: The connection was terminated before a greeting could be read.

Jan 23, 2015 at 8:20 PM
Edited Jan 23, 2015 at 8:35 PM
Great package.

I'm doing a bulk download of a large number of files (mainly PDFs, but some zips and text files as well), and I run into an unusual problem some time into the download.

I am using an asynchronous breadth-first traversal function to crawl an FTP site and download all the files in each leaf node. It explores around 3000 folders and ends up queueing around 3168 files asynchronously, each with the following code:
    Logger.Info("Download started");
    Stream s = await Task<Stream>.Factory
        .FromAsync(Client.BeginOpenRead, Client.EndOpenRead, path, Client);
The stream is then passed to a function which writes it to a file and logs "Wrote file". The log file can be summed up like this:
First ~3000 lines: "Download started" messages
Next ~200 lines: interleaved "Download started" and "Wrote file" messages
Next ~1000 lines: "Wrote file" messages
The problem is that after this point, I start getting this exception:
System.IO.IOException was caught
  Message=The connection was terminated before a greeting could be read.
    Server stack trace: 
       at System.Net.FtpClient.FtpClient.Connect()
       at System.Net.FtpClient.FtpClient.OpenRead(String path, FtpDataType type, Int64 restart)
       at System.Runtime.Remoting.Messaging.StackBuilderSink._PrivateProcessMessage(IntPtr md, Object[] args, Object server, Object[]& outArgs)
       at System.Runtime.Remoting.Messaging.StackBuilderSink.AsyncProcessMessage(IMessage msg, IMessageSink replySink)
    Exception rethrown at [0]: 
       at System.Runtime.Remoting.Proxies.RealProxy.EndInvokeHelper(Message reqMsg, Boolean bProxyCase)
       at System.Runtime.Remoting.Proxies.RemotingProxy.Invoke(Object NotUsed, MessageData& msgData)
       at System.Net.FtpClient.FtpClient.AsyncOpenRead.EndInvoke(IAsyncResult result)
       at System.Net.FtpClient.FtpClient.EndOpenRead(IAsyncResult ar)
       at System.Threading.Tasks.TaskFactory`1.FromAsyncCoreLogic(IAsyncResult iar, Func`2 endFunction, Action`1 endAction, Task`1 promise, Boolean requiresSynchronization)
What is causing this exception, and how can I prevent, avoid, or tolerate it?
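
For reference, here is a minimal sketch of one way the error might be tolerated. It assumes (unconfirmed; the stack trace does not prove it) that the server drops new connections because too many are opened at once. `ThrottledOpener`, its parameters, and the `open`/`consume` delegates are hypothetical names made up for illustration; they are not part of System.Net.FtpClient. The sketch caps the number of simultaneous downloads with a `SemaphoreSlim` and retries an `IOException` a limited number of times with a growing delay.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper (not part of System.Net.FtpClient): limits how many
// downloads run at once and retries a download that fails with IOException.
public sealed class ThrottledOpener
{
    private readonly SemaphoreSlim _gate;
    private readonly int _maxAttempts;
    private readonly TimeSpan _baseDelay;

    public ThrottledOpener(int maxConcurrent, int maxAttempts, TimeSpan baseDelay)
    {
        _gate = new SemaphoreSlim(maxConcurrent);
        _maxAttempts = maxAttempts;
        _baseDelay = baseDelay;
    }

    // open:    starts one download and returns its stream
    // consume: writes the stream somewhere; it may run again on a retry,
    //          so it should overwrite rather than append
    public async Task RunAsync(Func<Task<Stream>> open, Func<Stream, Task> consume)
    {
        // Wait for a free slot so only maxConcurrent downloads are in flight.
        await _gate.WaitAsync();
        try
        {
            for (int attempt = 1; ; attempt++)
            {
                try
                {
                    using (Stream s = await open())
                    {
                        await consume(s);
                    }
                    return; // success
                }
                catch (IOException)
                {
                    // Give up once the attempt budget is spent.
                    if (attempt >= _maxAttempts) throw;
                }

                // Back off a little longer after each failed attempt.
                await Task.Delay(
                    TimeSpan.FromMilliseconds(_baseDelay.TotalMilliseconds * attempt));
            }
        }
        finally
        {
            _gate.Release();
        }
    }
}
```

With the call from above, usage might look like `await opener.RunAsync(() => Task<Stream>.Factory.FromAsync(Client.BeginOpenRead, Client.EndOpenRead, path, Client), WriteToFile);`, where `opener` is a shared `ThrottledOpener` and `WriteToFile` stands in for the function that writes the stream to disk. The concurrency limit and retry count are guesses that would need tuning against the actual server.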