Enhance synchronous software API to allow asynchronous consuming












I have a module in Python 3.5+, providing a function that reads some data from a remote web API and returns it. The function relies on a wrapper function, which in turn uses the library requests to make the HTTP call.



Here it is (omitting on purpose all data validation logic and exception handling):



# module fetcher.py

import requests

# high-level module API
def read(some_params):
    return get_data(some_params)

# wrapper for the actual remote API call
def get_data(some_params):
    resp = requests.get('http://example.com', params=some_params)
    return resp.json()


The module is currently imported and used by multiple clients.



As of today, the call to get_data is inherently synchronous: this means that whoever uses the function fetcher.read() knows that this is going to block the thread the function is executed on.



What I would love to achieve



I want to allow fetcher.read() to be run both synchronously and asynchronously (e.g. via an event loop).
This would keep compatibility with existing callers of the module while letting callers that want to call the function asynchronously use non-blocking calls for better throughput.



This said, my legitimate wish is to modify the original code as little as possible...



As of today, the only thing I know is that Requests does not support asynchronous operations out of the box, and therefore I should switch to an asyncio-friendly HTTP client (e.g. aiohttp) to get non-blocking behaviour.



How would the above code need to be modified to meet my desiderata? Which also leads me to ask: is there any best practice about enhancing sync software APIs to async contexts?










      python asynchronous refactoring python-asyncio






      asked Nov 28 '18 at 14:28









      csparpa

























          1 Answer















          > I want to allow the fetcher.read() to be run both in a synchronous and an asynchronous fashion (eg. via an event loop).




          I don't think it is feasible for the same function to be usable via both sync and async API because the usage patterns are so different. Even if you could somehow make it work, it would be just too easy to mess things up, especially taking into account Python's dynamic-typing nature. (For example, users might accidentally forget to await their functions in async code, and the sync code would kick in, thus blocking their event loop.)
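
To make the "forgot to await" pitfall concrete, here is a minimal sketch. The `read` coroutine below is a hypothetical stand-in that does no network I/O, and `asyncio.run` assumes Python 3.7 or later. Calling a coroutine function without `await` only creates a coroutine object; nothing runs until it is awaited.

```python
import asyncio
import inspect


# Hypothetical stand-in for an async read(); it performs no real HTTP call.
async def read(some_params):
    await asyncio.sleep(0)
    return {"params": some_params}


async def caller():
    forgotten = read("x")  # missing await: this is a coroutine object, not data
    is_coro = inspect.iscoroutine(forgotten)
    result = await forgotten  # awaiting the object actually runs the body
    return is_coro, result


outcome = asyncio.run(caller())
```

A function that tried to serve both styles would have to guess which of the two the caller meant, which is where the risk of silently blocking the event loop comes from.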



          Instead, I would recommend the actual API to be async, and to create a trivial sync wrapper that just invokes the entry points using run_until_complete. Something along these lines:



          # new module afetcher.py (or fetcher_async, or however you like it)

          import aiohttp

          # high-level module API
          async def read(some_params):
              return await get_data(some_params)

          # wrapper for the actual remote API call
          async def get_data(some_params):
              async with aiohttp.request('GET', 'http://example.com', params=some_params) as resp:
                  return await resp.json()


          Yes, you switch from using requests to aiohttp, but the change is mechanical as the APIs are very similar in spirit.



          The sync module would exist for backward compatibility and convenience, and would trivially wrap the async functionality:



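          # module fetcher.py

          import asyncio

          import afetcher

          def read(some_params):
              # Block until the coroutine finishes. On Python 3.7+,
              # asyncio.run(afetcher.read(some_params)) is a simpler alternative.
              loop = asyncio.get_event_loop()
              return loop.run_until_complete(afetcher.read(some_params))

          ...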


          This approach provides both sync and async versions of the API without code duplication, because the sync version consists of trivial trampolines whose definitions can be further compressed using appropriate decorators.
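
One possible shape for such a decorator is sketched below. The names `sync_wrapper` and `fake_async_read` are made up for illustration, the stand-in coroutine does no network I/O, and `asyncio.run` assumes Python 3.7 or later (on 3.5/3.6 the body would use `run_until_complete` instead).

```python
import asyncio
import functools


def sync_wrapper(async_fn):
    """Return a blocking function that runs ``async_fn`` to completion."""
    @functools.wraps(async_fn)
    def wrapper(*args, **kwargs):
        # asyncio.run creates a fresh event loop, runs the coroutine, then
        # closes the loop. It raises RuntimeError if this thread already has
        # a running event loop, so the wrapper is for sync callers only.
        return asyncio.run(async_fn(*args, **kwargs))
    return wrapper


# Hypothetical stand-in for afetcher.read; it only echoes its input.
async def fake_async_read(some_params):
    await asyncio.sleep(0)
    return {"params": some_params}


# The sync entry point becomes a one-liner per function.
read = sync_wrapper(fake_async_read)
```

With this in place, the sync module reduces to a list of assignments such as `read = sync_wrapper(afetcher.read)`.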



          The async fetcher module should have a nice short name, so that the users don't feel punished for using the async functionality. It should be easy to use, and it actually provides a lot of new features compared to the sync API, most notably low-overhead parallelization and reliable cancellation.
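
As a rough illustration of those two features, the sketch below runs several reads concurrently with `asyncio.gather` and cancels a slow one with `asyncio.wait_for`. `fake_read` is a hypothetical stand-in that sleeps instead of making an HTTP call, and the delays and timeout are arbitrary values chosen for the example.

```python
import asyncio


# Hypothetical stand-in for afetcher.read: simulates latency with a sleep.
async def fake_read(some_params, delay=0.05):
    await asyncio.sleep(delay)
    return {"params": some_params}


async def read_many(param_list):
    # All reads are started together on one thread; gather returns the
    # results in the same order as the inputs.
    return await asyncio.gather(*(fake_read(p) for p in param_list))


async def read_with_timeout(some_params, timeout):
    # wait_for cancels the pending coroutine once the timeout expires.
    try:
        return await asyncio.wait_for(fake_read(some_params, delay=1.0), timeout)
    except asyncio.TimeoutError:
        return None
```

A caller inside an event loop would simply `await read_many([...])`; from sync code, `asyncio.run(read_many([...]))` does the same on Python 3.7+.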



          The route that is not recommended is using run_in_executor or similar thread-based tool to run requests in a thread pool under the hood. That implementation doesn't provide the actual benefits of using asyncio, but incurs all the costs. In that case it is better to continue providing the synchronous API and leave it to the users to use concurrent.futures or similar tools for parallel execution, where they're at least aware they're using threads.
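
For comparison, this is roughly what the caller-side alternative looks like when the module stays synchronous. `fake_blocking_read` is a hypothetical stand-in for the blocking `fetcher.read`, and the pool size is an arbitrary choice.

```python
import time
from concurrent.futures import ThreadPoolExecutor


# Hypothetical stand-in for the blocking fetcher.read; sleeps instead of
# performing an HTTP request.
def fake_blocking_read(some_params):
    time.sleep(0.05)
    return {"params": some_params}


def read_all(param_list, max_workers=4):
    # The caller opts into threads explicitly. Executor.map yields results
    # in input order, and the with-block waits for all workers to finish.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fake_blocking_read, param_list))
```

Here the threading is visible in the caller's own code rather than hidden behind an API that looks asynchronous.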






                edited Dec 4 '18 at 14:31

























                answered Nov 28 '18 at 15:34









                user4815162342
































