hn_throwaway_99 2 days ago

> The main problem GraphQL tries to solve is overfetching.

My issue with this article is that, as a GraphQL fan, overfetching is far from what I see as its primary benefit, so the rest of the article feels like a strawman to me.

TBH I see the biggest benefits of GraphQL as being that it (a) forces a much tighter contract around endpoint and object definitions with its type system, and (b) makes schema evolution much easier than other API tech does.

For the first point, the entire ecosystem guarantees that when a server receives an input object, that object will conform to the type, and similarly, a return object received by a client is guaranteed to conform to the endpoint response type. Coupled with custom scalar types (e.g. "phone number" types, "email address" types), this can eliminate a whole class of bugs and security issues. Yes, other API tech does something similar, but I find the guarantees are far less "guaranteed" and it's much easier for errors to slip through. For example, GraphQL always prunes return objects to just the fields requested, which most other API tech doesn't do, and this can be a really nice security benefit.
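
As a sketch of the custom-scalar point with graphql-js (the EmailAddress scalar and its regex are illustrative, not built-ins):

  import { GraphQLScalarType, Kind } from "graphql";

  // Illustrative check; a production validator would be stricter.
  function assertEmail(value: unknown): string {
    if (typeof value !== "string" || !/^[^@\s]+@[^@\s]+\.[^@\s]+$/.test(value)) {
      throw new TypeError("Not a valid email address");
    }
    return value;
  }

  // Every value passing through this scalar is validated before any
  // resolver runs, and again on its way out to the client.
  const EmailAddress = new GraphQLScalarType({
    name: "EmailAddress",
    serialize: assertEmail,   // outgoing response values
    parseValue: assertEmail,  // inputs supplied via variables
    parseLiteral: (ast) =>    // inputs inlined in the query text
      assertEmail(ast.kind === Kind.STRING ? ast.value : undefined),
  });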

When it comes to schema evolution, I've found that adding new fields and deprecating old ones is a huge benefit, especially since new clients only ever have to be concerned with the new fields. Again, other API tech lets you do something like this, but it's much less standardized and requires a lot more work and cognitive load from both the server and client devs.
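
For instance, a minimal sketch of that evolution in SDL via graphql-js (the field names are hypothetical); the built-in @deprecated directive keeps old clients working while tooling steers new ones away:

  import { buildSchema } from "graphql";

  const schema = buildSchema(`
    type User {
      id: ID!
      fullName: String!
      # Old clients keep querying this; new clients see the deprecation
      # in introspection and autocomplete and never touch it.
      name: String! @deprecated(reason: "Use fullName instead")
    }

    type Query {
      user(id: ID!): User
    }
  `);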

jakubriedl a day ago

I 100% agree that overfetching isn't the main problem GraphQL solves for me.

I'm actually spending a lot of time in the REST-ish world, and the contract isn't the problem I'd solve with GraphQL either. For that I'd go with OpenAPI and its enforcement and validation. That is very viable these days, it just isn't a "default" in the ecosystem.

For me the main problem GraphQL solves, which I haven't found a good alternative for, is API composition and evolution, especially in M:N client-service scenarios in large systems. Having the mindset of "client describes what they need" -> "GraphQL server figures out how to get it" -> "domain services resolve the parts" makes long term management of a network of APIs much easier. And when it's combined with good observability it can become one of the biggest enablers for data access.
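
A minimal sketch of that mindset in the resolver-map style of graphql-tools/Apollo Server (both domain services here are hypothetical):

  // The gateway schema owns the contract; each field delegates to the
  // domain service that can resolve that part.
  interface Ctx {
    userService: { current(): Promise<{ id: string; name: string }> };
    orderService: { forUser(id: string): Promise<{ id: string; total: number }[]> };
  }

  const resolvers = {
    Query: {
      me: (_: unknown, __: unknown, ctx: Ctx) => ctx.userService.current(),
    },
    User: {
      // Only invoked if the client actually asked for `orders`.
      orders: (user: { id: string }, _: unknown, ctx: Ctx) =>
        ctx.orderService.forUser(user.id),
    },
  };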

  • Seattle3503 a day ago

    > For me the main problem GraphQL solves, which I haven't found a good alternative for, is API composition and evolution, especially in M:N client-service scenarios in large systems. Having the mindset of "client describes what they need" -> "GraphQL server figures out how to get it" -> "domain services resolve the parts" makes long term management of a network of APIs much easier. And when it's combined with good observability it can become one of the biggest enablers for data access.

    I've seen this solved in REST land by using a load balancer or proxy that does path-based routing: api.foo.com/bar/baz gets routed to the "bar" service.
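
    A minimal sketch of that kind of gateway with Express and http-proxy-middleware (the service hostnames are made up):

      import express from "express";
      import { createProxyMiddleware } from "http-proxy-middleware";

      const app = express();
      // api.foo.com/bar/baz -> the "bar" service, purely path-based.
      app.use("/bar", createProxyMiddleware({ target: "http://bar-service:8080" }));
      app.use("/baz", createProxyMiddleware({ target: "http://baz-service:8080" }));
      app.listen(8080);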

    • btreecat a day ago

      Doesn't even need to be a proxy; you can lay out your controllers and endpoints like this just fine in most modern frameworks.

  • hn_throwaway_99 a day ago

      Completely agree with this rationale too. GraphQL does encapsulation really, really well. The client just knows about a single API surface, while the detail of which backend services handle (parts of) each call is completely hidden.

    On a related note, this is also why I really dislike those "Hey, just expose your naked DB schemas as a GraphQL API!" tools. Like the best part about GraphQL is how it decouples your API contract from backend implementation details, and these tools come along and now you've tightly coupled all your clients to your DB schema. I think it's madness.

    • sandeepkd a day ago

      I have used and implemented GraphQL at two large scale companies across multiple (~xx) services. There are similarities in how it unfolds, however I have not seen any real world problem being solved with it so far.

      1. The main argument for introducing it has always been appropriate data fetching for clients, where clients can describe exactly what's required.

      2. The ability to define a schema is touted as an advantage, but managing the schema becomes a nightmare. (Btw, the schema already exists at the persistence layer if that was required; schema changes and schema migrations are already challenging, you just happen to replicate the challenge in one additional layer with GraphQL.)

      3. You go big and you get into GraphQL servers calling other GraphQL servers, and that's when things become really interesting. People do not realize/remember/care about the source of the data, you have name collisions, you get into namespaces.

      4. You started on the pretext of optimizing queries, and now you have this layer that your client works with, so the natural flow is to implement mutations with GraphQL too.

      5. Things are downhill from this point. With distributed services you had already lost transactionality; GraphQL mutations just add to it. You get into circular references because underlying services are just calling other services via GraphQL to get the data you asked for with a GraphQL query.

      6. The worst: you do not want too many small schema objects, so now you have this one big schema that gets you everything from multiple REST API endpoints, and clients are back where they started from: pick what you need to display on the screen.

      7. Open up the network tab of any enterprise application which uses GraphQL and it is easy to see how much non-usable data is fetched via GraphQL for displaying simplistic pages.

      There is nothing wrong with GraphQL; pretty much the same applies to all tools. It comes down to how you use it and how good you are at understanding the trade-offs. Treating anything like a silver bullet is going to lead in the same direction. Pretty much every engineer who has operated at application scale is aware of this; unfortunately they just stay quiet.

    • dfee a day ago

      I agree as well. This may be the only thing GraphQL excels at. Dataloader implementations give this superpowers.
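
      A minimal sketch of that pattern with the dataloader package (db.usersByIds is a hypothetical batch fetch):

        import DataLoader from "dataloader";

        declare const db: {
          usersByIds(ids: readonly string[]): Promise<{ id: string; name: string }[]>;
        };

        // All .load() calls made in one tick are coalesced into a single
        // batch fetch, collapsing the N+1 queries that naive per-field
        // resolvers would otherwise issue.
        const userLoader = new DataLoader(async (ids: readonly string[]) => {
          const rows = await db.usersByIds(ids);
          const byId = new Map(rows.map((u) => [u.id, u]));
          // DataLoader requires results in the same order as the keys.
          return ids.map((id) => byId.get(id) ?? null);
        });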

      OpenAPI, Thrift and protobuf/gRPC are all far better schema languages. For example: the separation of input types and object types.

lateforwork 2 days ago

If you generate TypeScript types from OpenAPI specs then you get contracts for both directions. There is no problem here for GraphQL to solve.

  • WickyNilliams a day ago

    This is very much possible, and I have done it, and it works great once it's all wired up.

    But OpenAPI is verbose to the point of absurdity. You can't feasibly write it by hand, so you can't do schema-first development. You need an OpenAPI-compatible lib for authoring your API, some tooling to generate the schema from the code, and then another tool to generate types from the schema. Each step tends to implement the spec to varying degrees, creating gaps in the types or just outright failing.

    Fwiw I tried many, many tools to generate the TypeScript from the schema. Most resulted in horrendous, bloated code, the official generators especially. Many others just choked on a complex schema, or used basic string concatenation to output the TypeScript, leading to invalid code. Additionally, the cost of the generated code scales with the schema size, which can mean shipping huge chunks of code to the client as your API evolves.

    The tool I will wholeheartedly recommend (and with which I am unaffiliated besides making a few PRs) is openapi-ts. It is fast and correct, and you pay a fixed cost: there's a fetch wrapper at runtime and everything else exists at the type level.
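
    For reference, a sketch of what that looks like (the /pets/{petId} endpoint is hypothetical): generate the types once, then every call is checked against the spec.

      // npx openapi-typescript ./openapi.yaml -o ./schema.d.ts
      import createClient from "openapi-fetch";
      import type { paths } from "./schema";

      const client = createClient<paths>({ baseUrl: "https://api.example.com" });

      // Path, params, and the response type are all derived from the spec.
      const { data, error } = await client.GET("/pets/{petId}", {
        params: { path: { petId: "42" } },
      });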

    I was kinda surprised how bad a lot of the tooling was considering how mature OpenAPI is. Perhaps it's advanced in the last year or so, when I stopped working on the project where I had to do this.

    https://openapi-ts.dev/

    • jcgl an hour ago

      Agree with the other comments about writing OpenAPI by hand. It’s really not that bad at all, and most certainly not “verbose to the point of absurdity.”

      Moreover, system boundaries are the best places to invest in being explicit. OpenAPI specs really don't have that much overhead (especially if you make use of YAML anchors), and are (usually) expressive enough to describe the boundary.

      In any case, starting with a declarative contract/IDL and doing something like codegen is a great way to go.

    • 0x696C6961 a day ago

      I write all of my OpenAPI specs by hand. It's not hard.

      • WickyNilliams a day ago

        I imagine you are very much in the minority. A simple hello world is like a screen full of YAML. The equivalent in GraphQL (or TypeSpec, which I always wanted to try as an authoring format for OpenAPI: https://typespec.io/) would be a few lines.

      • aitchnyu a day ago

        Do you validate responses from the spec, client-side and server-side? (FastAPI does this and prevents invalid responses from being sent.)

    • hokkos a day ago

      I use https://typespec.io to generate OpenAPI; writing OpenAPI YAML quickly became horrible past a few APIs.

      • WickyNilliams a day ago

        Ha yes, see one of my other comments to another reply.

        I never got to use it when I last worked with OpenAPI, but it seemed like the antidote to the verbosity. Glad to hear someone had a positive experience with it. I'll definitely try it next time I get the chance.

  • c-hendricks 2 days ago

    What about the whole "graph" part? Are there any OpenAPI libraries that deal with that?

    • lateforwork 2 days ago

      OpenAPI definitions include class hierarchies as well (via allOf/oneOf composition). You can use tools to generate TypeScript type definitions from that.

  • komali2 2 days ago

    Discovering Kubb was a game changer for me last year.

    • HumanOstrich 2 days ago

      Thanks for mentioning this. I always find it unsettling when I've researched solutions for something and only find a better option from a random HN comment.

      Site: https://kubb.dev/

      • WickyNilliams a day ago

        Fwiw I tried every tool imaginable a few years ago, including Kubb (which I think I contributed to while testing things out).

        The only mature, correct, fast option with a fixed cost (since it mostly exists at the type level, meaning it doesn't scale your bundle with your API) was openapi-ts. I am not affiliated, other than being a previously happy user, though I did make some PRs while using it: https://openapi-ts.dev/

      • bakugo 2 days ago

        This project seems to be mostly AI generated, so keep that in mind before replacing any existing solutions.

  • iterateoften 2 days ago

    GraphQL solves the problem. There is no problem here for OpenAPI to solve.

    See how that works?

    • thayne 2 days ago

      OpenAPI is older than GraphQL.

      But the point is that that benefit is not unique to GraphQL, so by itself it is not a compelling reason to choose GraphQL over something else.

      • iterateoften 2 days ago

        Yeah, that was just one of the many benefits the parent listed.

      • tt_dev 2 days ago

        Plus now you have two sources of truth.

        • iterateoften 2 days ago

          ? I have a single source of truth in the gql schema. My frontend calls are generated from the backend schema and type-checked against it.

  • bastawhiz a day ago

    tRPC sort of does this (there's no spec, but you don't need a spec because the interface is managed by tRPC on both sides). But it loses the real defining quality of GQL: not needing subsequent requests.

    If I need more information about a resource that an endpoint exposes, I need another request. If I'm looking at a podcast episode, I might want to know the podcast network that the show belongs to. So first I have to look up the podcast from the id on the episode. Then I have to look up the network by the id on the podcast. Now, two requests later, I can get the network details. GQL gives that to me in one query, and the fundamental properties of what makes GQL GQL are what enables that.

    Yes, you can jam podcast data onto the episode, and network data inside of that. But now I need a way to not request all that data so I'm not fetching it in all the places where I don't need it. So maybe you have an "expand" parameter: this is what Stripe does. And really, you've just invented a watered-down, bespoke GraphQL.
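
    For comparison, the whole traversal above is a single GQL request (the schema names are hypothetical, mirroring the example):

      const source = /* GraphQL */ `
        query EpisodePage($id: ID!) {
          episode(id: $id) {
            title
            podcast {
              title
              network { name website }
            }
          }
        }
      `;
      // One round trip; the server walks episode -> podcast -> network.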

    • lateforwork a day ago

      Is dealing with GQL easier than implementing a BFF? There may be cases where that is true, but it is not always true.

      • bastawhiz a day ago

        I think BFF works at a small scale, but that's true of any framework. Building a one-off handful of endpoints will always be less work than putting a framework in place and building against it.

        GQL has a pretty substantial up front cost, undeniably. But you hopefully balance that with the benefit you'd get from it.

  • mixedCase a day ago

    If you generate OpenAPI specs, clients, and server type definitions from a declarative API definition made with Effect's own @effect/platform, it solves even more things in a nicer, more robust fashion.

hjnilsson 2 days ago

Agree whole-heartedly. The strong contracts are the #1 reason to use GraphQL.

The other one I would mention is the ability to very easily reuse resolvers in composition, and even federate them, something that can be very clunky to get right in REST APIs.

  • specialp 2 days ago

    Contracts for data with OpenAPI or an RPC don't come with the overhead of making a resolver for infinite permutations when your apps probably need only a few, or perhaps one. That's why REST plus something for validation is enough for most, and doesn't cost as much.

  • verdverm 2 days ago

    re: #1, is there a meaningful difference between GraphQL and OpenAPI here?

    Composed resolvers are a headache for most and not seen as a net benefit. You can have proxied (federated) subsets of routes in REST; that ain't hard at all.

    • JasonSage 2 days ago

      > Composed resolvers are a headache for most and not seen as a net benefit. You can have proxied (federated) subsets of routes in REST; that ain't hard at all

      Right, so if you take away the resolver composition (this is graph composition and not route federation), you can do the same things with a similar amount of effort in REST. This is no longer a GraphQL vs REST conversation, it's an acknowledgement that if you don't want any of the benefits you won't get any of the benefits.

      • verdverm 2 days ago

        There are pros & cons to GraphQL resolver composition, not just benefits.

        It is that very compositional graph resolving that makes many see it as overly complex, not as a benefit but as a detriment. You seem to imply that the benefit is guaranteed and that graph resolving cannot be done within a REST handler. It can be, and it's much simpler and easier to reason about: I'm still going to go get the same data, but with less complexity and reasoning overhead than using the resolver composition concept from GraphQL.

        Is resolver composition really that different from function composition?

8n4vidtmkvmk 2 days ago

Pruning the request and even the response is pretty trivial with zod. I wouldn't onboard GQL for that alone.
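
For instance, zod's default "strip" behavior drops unknown keys on parse (a minimal sketch):

  import { z } from "zod";

  const User = z.object({ id: z.string(), email: z.string().email() });

  // Unknown keys are silently stripped by default, so handlers only
  // ever see (or return) the declared shape.
  const safe = User.parse({
    id: "1",
    email: "a@example.com",
    internalFlag: true, // dropped
  });
  // safe is { id: "1", email: "a@example.com" }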

Not sure about the schema evolution part. Protobufs seem to work great for that.

  • hn_throwaway_99 2 days ago

    > Pruning the request and even the response is pretty trivial with zod.

    I agree with that, and when I'm in a "typescript only" ecosystem, I've switched to primarily using tRPC vs. GraphQL.

    Still, I think people tend to underestimate the value of the clear contracts and guarantees that GraphQL enforces (not to mention its whole ecosystem of tools), completely outside of any code you have to write. Yes, you can do your own zod validation, but in a large team, as an API evolves and people come and go, having hard, unbreakable lines in the sand (vs. something you have to roll yourself, or that is done by convention) is important IMO.

  • hamandcheese a day ago

    In my (now somewhat dated) GraphQL experience, evolving an API is much harder. Input parameters in particular: if a server gets inputs it doesn't recognize, or if client and server disagree that a field is optional or not (even if a value was still supplied for it, so the question is moot), the server will reject the request.

    • hdjrudni 15 hours ago

      > If a server gets inputs it doesn't recognize

      If you just slap in Zod, the server will drop the extra inputs. If you hate Zod, it's not hard to design a similar thing.

      > or if client and server disagree that a field is optional or not

      Doesn't GQL have the concept of required vs optional fields too? IIUC it's the same problem. You just have to be very diligent about this; there's not really a way around it. Protobufs went as far as removing 'required' from the spec because this was such a common problem. Just don't make things required, ever :-)

  • FootballMuse 2 days ago

    Pruning a response does nothing since everything still goes across the network.

    • hdjrudni 2 days ago

      Pruning the response would help validate that your response schema is correct and that it delivers what was promised.

      But you're right, if you have version skew and the client is expecting something else then it's not much help.

      You could do it client-side, so that if the server adds an optional field the client would immediately prune it off. If the server removes a field, the client could fill it with a default. At a certain point too much skew will still break something, but that's probably what you want anyway.

    • hn_throwaway_99 2 days ago

      You're misunderstanding. In GraphQL, the server prunes the response object. That is, the resolver method can return a "fat" object, but only the object pruned down to just the requested fields is returned over the wire.

      It is an important security benefit, because one common attack vector is to see if you can trick a server method into returning additional privileged data (like detailed error responses).
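
      A minimal sketch of that behavior with graphql-js (the types and values are illustrative):

        import { buildSchema, graphql } from "graphql";

        const schema = buildSchema(`
          type User { id: ID! name: String email: String }
          type Query { me: User }
        `);

        // The resolver returns a "fat" object...
        const rootValue = {
          me: () => ({ id: "1", name: "Ada", email: "ada@example.com", stack: "..." }),
        };

        // ...but only the requested fields are serialized over the wire.
        const result = await graphql({ schema, source: "{ me { id name } }", rootValue });
        // result.data -> { me: { id: "1", name: "Ada" } }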

      • JAlexoid a day ago

        I would like to remind you that in most cases the GQL server is not colocated on the same hardware as the services it queries.

        Therefore requests between the GQL server and downstream services are also travelling "over the wire" (though I don't see that as an issue).

        Having REST APIs that return only "fat" objects is really not the most secure way of designing APIs.

      • fastball a day ago

        "Just the requested fields" as requested by the client?

        Because if so that is no security benefit at all, because I can just... request the fat fields.

tomnipotent a day ago

Facebook had started bifurcating API endpoints to support iOS vs Android vs Web, and over time a large number of OS-specific endpoints evolved. A big part of their initial GraphQL marketing was to solve this problem specifically.

dgan 2 days ago

Sorry, but I'm not convinced. How is this different from two endpoints communicating through, let's say, protobuf? Both input and output will be (un)parsed only when conforming to the definition.

scotty79 a day ago

> when a server receives an input object, that object will conform to the type

Anything that comes from the front end can be tampered with. The server is guaranteed nothing.

> GraphQL always prunes return objects to just the fields requested, which most other API tech doesn't do, and this can be a really nice security benefit.

Requests can be tampered with, so there's no additional security from the GraphQL protocol. Security must be implemented by narrowing down to only the allowed data on the server side. How much of it is requested doesn't matter for security.

  • JAlexoid a day ago

    Expecting GraphQL to handle security is really one of the poorest ways of doing security, as GQL is not designed to do that.
