cj 3 days ago

This is cool, but I wish it were integrated into tools already used for coding and writing rather than having it be a separate app.

This also demonstrates the type of things Google could do with Gemini integrated into Google Docs if they step up their game a bit.

Honestly, I’m scratching my head over OpenAI’s desire to double down on building out their consumer B2C use cases rather than truly focusing on being the infrastructure/API provider for other services to plug into. If I had to make a prediction, I think OpenAI will end up being either an infrastructure provider OR a SaaS, but not both, in the long term (5-10 yrs from now).

  • jcfrei 3 days ago

    When they focus on just being an API provider, they will be in a market with (long-term) razor-thin margins and high competition - most likely unable to build a deep moat. But if you can shape customers' habits so they always type "chatgpt.com" into the browser whenever they want to use AI, then that's a very powerful moat. Those customers will also most likely be on a subscription basis, meaning much more flexibility in pricing and more rent for OpenAI (people using it less than what OpenAI assumes when setting subscription costs).

    • james_marks 3 days ago

      I agree, and it’s why I have come to dislike OpenAI.

      We are getting front row seats to an object lesson in “absolute power corrupts absolutely”, and I am relieved they have a host of strong competitors.

    • adventured 3 days ago

      That's the difference between what Google would have been had it just tried to be an enterprise search API, versus owning the consumer destination for search input/results.

      • cynicalpeace 3 days ago

        Google will be remembered as a victim of Schumpeter's Creative Destruction

        • james_marks 3 days ago

          From Wikipedia, for those who don't know the term: “a concept in economics that describes a process in which new innovations replace and make obsolete older innovations.”

          Ironically, I had to google it, and agree with the comment.

          • satvikpendem 3 days ago

            You should read The Innovator's Dilemma as well, as it goes into detail on this concept, basically explaining why and how technological disruption occurs from the point of view of the disruptor and disruptee.

    • movedx 19 hours ago

      This sounds like a good way of guaranteeing profits and moving the mindset away from just producing good products.

    • aixpert 2 days ago

      Your argument, which could be correct, makes their choice of the name ChatPGT* even more idiotic

      * Or whichever variant the average user might try to type in

  • svat 3 days ago

    > the type of things Google could do with Gemini integrated into Google Docs

    Google already does have this in Google Docs (and all their products)? You can ask it questions about the current doc, select a paragraph and click "rewrite", things like that. It has helped me get over writer's block at least a couple of times. Similarly for making slides etc. (It requires the paid subscription if you want to use it from a personal account.)

    https://support.google.com/docs/answer/13951448 shows some of it for Docs, and https://support.google.com/mail/answer/13447104 is the one for various Workspace products.

    • Zinu 3 days ago

      Those look more like one-off prompts, and not a proper chat/collab with Gemini.

      • svat 3 days ago

        That's there too; see https://support.google.com/docs/answer/14206696 — you can click on the "Ask Gemini ⟡" and carry on a conversation, e.g. "summarize emails about <topic>" and use those to paste into the doc. (I haven't found all that much use for referencing other files though. But the "proper chat" is useful for saying things like "no actually I meant something more like: …" and carrying on.)

  • al_borland 3 days ago

    I wouldn't be surprised to see Apple add something like this to Pages and some of their other apps. Their approach to AI, from what we've seen so far, has been about integrating it into existing apps and experiences, rather than making a separate AI app. I have to imagine this is the way forward, and these stand alone apps are basically tech demos for what is possible, rather than end-state for how it should be consumed by the masses.

    I agree with you on where OpenAI will/should sit in 5-10 years. However, I don't think them building the occasional tool like this is unwarranted, as it helps them show the direction companies could/should head with integration into other tools. Before Microsoft made hardware full time, they would occasionally produce something (or partner with brands) to show a new feature Windows supports as a way to tell the OEMs out there, "this is what we want you to do and the direction we'd like the PC to head." The UMPC[0] was one attempt at this which didn't take off. Intel also did something like this with the NUC[1]. I view what OpenAI is doing as a similar concept, but applied to software.

    [0] https://en.wikipedia.org/wiki/Ultra-mobile_PC

    [1] https://en.wikipedia.org/wiki/Next_Unit_of_Computing

    • acchow 3 days ago

      Every app with a significant installed user base is adding AI features.

      OP is lamenting that Cursor and OpenAI chose to create new apps instead of integrating with (someone else’s) existing apps. But this is a result of a need to be always fully unblocked.

      Also, owning the app opens up greater financial potential down the line…

    • rty32 2 days ago

      How many people use Pages these days? I don't think Apple even mentions the product at WWDC anymore. My guess is that most people either use the Microsoft suite as required by their employer or use cloud-based knowledge base/notes tools like Notion/Quip/Obsidian/Confluence etc. I doubt Apple thinks it worthwhile to invest in these products.

      • al_borland 2 days ago

        People who need to make the occasional document outside of work, who don’t need to invest in paying for Office, use iWork. I count myself in that list. I use Office at work (99% of that usage is Excel), but at home I use the iWork apps. Mostly Numbers, but Pages as well. I hear many of my friends and family doing the same, because it’s what they have, it’s good enough, and it’s free.

        Few people outside of tech circles know what those other apps you mentioned are. I use Confluence at work, because it’s what my company uses. I also tried using it at home, but not for the same stuff I’d use Pages for. I use Obsidian at work to stay organized, but again, it doesn’t replace what I’d use Pages for, it’s more of a Notes competitor in my book. A lot of people don’t want their documents locked away in a Notion DB, and it’s not something I’d think to use if I’m looking to print something.

        I went back and looked at the last WWDC video. Apple did mention the apps briefly, to say they have integrated Image Playgrounds, their AI image generation, into Pages, Keynote, and Numbers. With each major upgrade, the iWork apps usually get something. Office productivity isn’t exactly the center of innovation these days. The apps already do the things that 80% of users need.

  • serjester 3 days ago

    75% of OpenAI's revenue is coming from their consumer business - the better question is the long term viability of their public API.

    But if they believe they're going to reach AGI, it makes no sense to pigeonhole themselves to the interface of ChatGPT. Seems like a pretty sensible decision to maintain both.

    • 8338550bff96 3 days ago

      75%? That's astonishing to me. Where are you able to see those details?

      It wouldn't surprise me if not a lot of enterprises are going through OpenAI's enterprise agreements - most already have a relationship with Microsoft in one capacity or another, so going through Azure just seems like the lowest-friction way to get access. If the millions we spend on tokens through Azure to OpenAI are any indication of what other orgs are doing, I would expect consumers' $20/month to be a drop in the bucket.

      • jdgoesmarching 3 days ago

        This very good analysis estimates 73%, which includes team and enterprise. Given that enterprise access is limited and expensive, it seems Plus and Teams are mostly carrying this.

        The whole financial breakdown is fascinating and I’m surprised to not see it circulating more.

        https://www.wheresyoured.at/oai-business/

        • 8338550bff96 2 days ago

          I prefer analysis from industry experts, not PR execs moonlighting as tech bloggers

          • jdgoesmarching a day ago

            By all means, go listen to the industry experts trying to sell you on a bubble before it pops

            This analysis is just doing basic math based on reporting from the NYT and Post on OpenAI’s financials.

      • swarnie 3 days ago

        It may be pretty minimal, but I can personally vouch for 20ish techies in my own social orbit whose businesses won't authorise or won't pay for OpenAI yet, so they're paying out of their own pockets; I share an office with four of them.

        Maybe the consumer side will slide as businesses pick up the tab?

        • 8338550bff96 2 days ago

          I don't understand what "paying for OpenAI" is meant to mean. You mean paying for tokens?

  • debbiedowner 3 days ago

    An LLM named Duet has been in Google Docs for 17 months now! https://workspace.google.com/blog/product-announcements/duet...

    I've been using it for about a year.

    • franze 3 days ago

      never figured out how to activate it in my workspace

      • karamanolev 3 days ago

        Same here. I feel like Google's products have become such a labyrinth of features, settings, integrations, separate (but not really) products, that navigating them requires an expert. Sadly, I don't see a way back - each new additional feature or product is just bolted on top and adds more complexity. Given the corporate structure of Google, there's zero chance of an org-wide restructuring of the labyrinth.

      • herval 3 days ago

        Google's approach to shipping products is puzzling. It's like they don't care if anyone uses them at all

        • Barrin92 3 days ago

          Google isn't a startup, they aren't desperate to impress anyone. I don't even think they consider "AI" to be a product, which is probably correct. These AI enabled features are background processes that ideally integrate into products over time in ways that don't require you to explicitly know they're even there.

          Given how widely used Google Docs is, for serious work, disrupting people's workflows is not a good thing. Google has no problem being second, they aren't going to die in the next three months just because people on Twitter say so.

          • herval 2 days ago

            I think what you mean is "Google is complacent, so they don't think they need to make a lot of effort to stay relevant"

  • wenc 3 days ago

    > This also demonstrates the type of things Google could do with Gemini integrated into Google Docs if they step up their game a bit.

    This is exactly what Google’s NotebookLM does. It’s (currently) free and it reads your Google Docs and does RAG on them.

    https://notebooklm.google/
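
    For anyone unfamiliar with the term, "RAG" (retrieval-augmented generation) just means fetching the chunks of your documents most relevant to a question and stuffing them into the prompt. A toy sketch of the idea in Python - embed() and ask_llm() are hypothetical stand-ins for real embedding/LLM calls, and a real system like NotebookLM is far more sophisticated:

      import numpy as np

      def cosine(a, b):
          return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

      def answer(question, chunks, embed, ask_llm, k=3):
          # Rank document chunks by similarity to the question and keep the top k.
          q_vec = embed(question)
          ranked = sorted(chunks, key=lambda c: cosine(embed(c), q_vec), reverse=True)
          context = "\n\n".join(ranked[:k])
          return ask_llm(f"Answer using only this context:\n{context}\n\nQ: {question}")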

    • HarHarVeryFunny 3 days ago

      The most amazing thing with NotebookLM is that it can turn your docs into a very high-quality podcast of two people discussing their content.

      • rty32 2 days ago

        It is a cool concept, but anyone who listens to enough podcasts knows that hosts have personalities and interests, and productions usually have their own style, focus, and quality. These features make podcast channels unique and make you want to come back. That's why you may want to listen to podcast A instead of B even though they discuss the same topics. I doubt the Google thing will ever give us that -- likely just one hour of generic rambling that gets boring.

      • supafastcoder 3 days ago

        It's fun the first time but it quickly gets boring.

      • theragra 3 days ago

        This feature is cool as fuck, but I noticed that the podcasts it generates lose quite a lot of detail from the original article. Even longreads turn into 13-minute chunks.

      • 8338550bff96 3 days ago

        Finding signal in noise is not an easy job given the clip things are moving along at. Whatever content creators need to do to deliver quality distilled content - I'm here for it.

      • ben_w 3 days ago

        Juggling dog. It's not very good, but it's amazing that it's possible at all.

        https://github.com/BenWheatley/Timeline-of-the-near-future

        I've only used the "Deep Dive" generator a few times, and I'm already sensing the audio equivalent of "youtube face" in the style — not saying that's inherently bad, but this is definitely early days for this kind of tool, so consider Deep Dive as it is today to be a GPT-2 demo of things to come.

        • stavros 3 days ago

          Do you have a reference for the "Juggling dog" thing? I've heard it with "singing dog", but I never managed to find any "official" reference or explanation of the thing.

          • Jerrrrrrry 3 days ago

            He meant singing dog, likely conflated due to his linguistic interest.

            "Juggling dog" has only been expressed a single time previously in our corpus of humanity:

              During the Middle Ages, however, church and state sometimes frowned more sternly on the juggler. "The duties of the king," said the edicts of the Sixth Council of Paris during the Middle Ages, "are to prevent theft, to punish adultery, and to refuse to maintain jongleurs."(4) What did these jugglers do to provoke the ire of churchmen? It is difficult to say with certainty, since the jongleurs were often jacks-of-all-trades. At times they were auxiliary performers who worked with troubadour poets in Europe, especially the south of France and Spain. The troubadours would write poetry, and the jongleurs would perform their verses to music. But troubadours often performed their own poetry, and jongleurs chanted street ballads they had picked up in their wanderings. Consequently, the terms "troubadour" and "jongleur" are often used interchangeably by their contemporaries.
            
            These jongleurs might sing amorous songs or pantomime licentious actions. But they might be also jugglers, bear trainers, acrobats, sleight-of-hand artists or outright mountebanks. Historian Joseph Anglade remarks that in the high Middle Ages:

            "We see the singer and strolling musician, who comes to the cabaret to perform; the mountebank-juggler, with his tricks of sleight-of-hand, who well represents the class of jongleurs for whom his name had become synonymous; and finally the acrobat, often accompanied by female dancers of easy morals, exhibiting to the gaping public the gaggle of animals he has dressed up — birds, monkeys, bears, savant dogs and counting cats — in a word, all the types found in fairs and circuses who come under the general name of jongleur.”(5) -- http://www.arthurchandler.com/symbolism-of-juggling

            • ben_w 3 days ago

              TIL about "jongleurs".

              I suspect what I heard was a deliberate modification of this sexist quote from Samuel Johnson, which I only found by this thread piquing my curiosity: "Sir, a woman's preaching is like a dog's walking on his hind legs. It is not done well; but you are surprised to find it done at all." - https://www.goodreads.com/quotes/252983-sir-a-woman-s-preach...

              Trying to find where I got my version from, takes me back to my own comments on Hacker News from 8 months ago, and I couldn't remember where I got it from then either:

              > "your dog is juggling, filing taxes, and baking a cake, and rather than be impressed it can do any of those things, you're complaining it drops some balls, misses some figures, and the cake recipe leaves a lot to be desired". - https://news.ycombinator.com/item?id=39170057

              My comment there predates this Mastodon thread, but the story in Mastodon may predate whoever told me the version I encountered: https://social.coop/@GuerillaOntologist/112598462146879765

            • SoftTalker 3 days ago

              It’s a great phrase all that aside. I’m adopting it.

              • cryptoz 3 days ago

                The confetti is out of the cannon!

              • Jerrrrrrry 2 days ago

                "Dogs were not aware of their shared interest in juggling until the invention of the internet, where like-minded canines would eventually congregate unto enclaves of specialty."

          • ben_w 3 days ago

            Unfortunately not.

            Trying to find where I got my version from just brought me back to one of my own comments on Hacker News from 8 months ago:

            > "your dog is juggling, filing taxes, and baking a cake, and rather than be impressed it can do any of those things, you're complaining it drops some balls, misses some figures, and the cake recipe leaves a lot to be desired". - https://news.ycombinator.com/item?id=39170057

            I couldn't remember where I got it from then either.

            • stavros 2 days ago

              Oh hm, maybe you came up with it/adapted a previous saying, then. I'm not 100% sure the dog was singing either.

              • gwern 2 days ago

                He is adapting one of Samuel Johnson's most famous quotations, about the astonishing sight of seeing a woman preaching - like a dog walking, it may not be done well, but it's astonishing to see it done at all.

  • ren_engineer 3 days ago

    ChatGPT itself is them copying their own API users, this is just them building out more features already built by users. My guess is they know they don't have a long term edge in models alone, so they are going to rely on expanding ChatGPT for better margins and to keep getting training data from users. They obviously want to control the platform, not integrate with other platforms

    • kridsdale3 2 days ago

      The Amazon model.

      Same as it ever was.

  • w0m 2 days ago

    If I'm reading this right, it's been in VSCode as Copilot Chat for a fair bit now. I use it often; when they added context (providing extra files to reference, or even the entire @workspace if it's small enough), it was an absolute gamechanger.

  • carom 3 days ago

    Their API is unusable due to rate limits. My wife and I have both had ideas, started using it, and found other approaches after hitting rate limits. I tried putting more money in the account to increase the rate limits and it did not work. I imagine they see poor growth there because of this.
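
    A common mitigation while stuck at a lower tier is to retry with exponential backoff rather than fail outright - a minimal sketch using the openai Python SDK, where the model name is just an example:

      import time

      from openai import OpenAI, RateLimitError

      client = OpenAI()  # reads OPENAI_API_KEY from the environment

      def chat_with_backoff(messages, model="gpt-4o-mini", max_retries=5):
          # Retry on 429s with exponentially growing waits instead of giving up.
          delay = 1.0
          for attempt in range(max_retries):
              try:
                  resp = client.chat.completions.create(model=model, messages=messages)
                  return resp.choices[0].message.content
              except RateLimitError:
                  if attempt == max_retries - 1:
                      raise
                  time.sleep(delay)
                  delay *= 2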

    • bearjaws 3 days ago

      It's pretty trivial to get increased limits, I've used the API for a few consulting projects and got to tier 4 in a month. At that point you can burn near $200 a day and 2 million tokens per minute.

      You only need 45 days to get tier 5 and if you have that many customers after 45 days you should just apply to YC lol.

        Maybe you checked over a year ago, which was the Wild West at the time; they didn't even have the tier limits.

      • epolanski 3 days ago

        > and if you have that many customers after 45 days you should just apply to YC lol.

        What for? If someone already has a business and customers, they're already far ahead of the average YC startup.

        • kridsdale3 2 days ago

          Who doesn't like free money and marketing?

          • epolanski 2 days ago

            It's not free money; you give up a sizeable amount of equity (7%) that you could otherwise sell for more than half a million.

            It depends really on the business and what not.

    • cryptoz 3 days ago

      150,000,000 tokens per minute and 30,000 requests per minute is unusable?! Maybe that’s just Tier 5 but the API is most definitely not unusable.

      I’m firmly in the camp that their rate limits are entirely reasonable.

    • byearthithatius 3 days ago

      You need to use it for some time to get into their higher tiers of usage. I used to also have this problem and it annoyed me greatly, but once I got to usage tier 4 it never happened again (except for o1-preview but that just wastes tokens IMO).

  • herval 3 days ago

    LLMs-as-a-service are much easier to replicate than physical data centers, and the potential user base is much smaller than the consumer one, so I'd imagine they're swimming upstream into B2C land in order to justify the valuation

    • truetraveller 3 days ago

      You mean downstream, not upstream. Upstream is closer to the raw materials.

      • herval 2 days ago

        the bike shed is blue

  • qingcharles 2 days ago

    Aren't we talking about, say, GitHub Copilot? That's integrated into Visual Studio/VSCode. I just started using it again as they've done some small upgrades, and the results can often be phenomenal. Like, I will visualize an entire block of code in my mind, and I'll type the first couple of characters and the entire block will just appear. I'm literally that predictable.

    Copilot is seemingly only using GPT-3.5 for most of the results, though. I'd be more excited if they would update the API they're using.

  • leetharris 3 days ago

    > Honestly, I’m scratching my head over OpenAI’s desire to double down on building out their consumer B2C use cases rather than truly focusing on being the infrastructure/API provider for other services to plug into

    I think it's because LLMs (and to some extent other modalities) tend to be "winner takes all." OpenAI doesn't have a long-term moat; their data and architecture are not wildly better than xAI's, Google's, MS's, Meta's, etc.

    If they don't secure their position as #1 Chatbot I think they will eventually become #2, then #3, etc.

    • aflukasz 3 days ago

      > If they don't secure their position as #1 Chatbot I think they will eventually become #2, then #3, etc.

      But can they do it at all? It's not like they're early Google vs. other search engines.

      • ben_w 3 days ago

        At the moment this feels like a 10x speed run of the browser wars: lots of competitors very quickly churning over who is "best" according to some metric, stuff getting baked into operating systems, freely licensed models.

        How do you make money off a web browser, to justify the development costs? And what does that look like in an LLM?

        • visarga 3 days ago

          LLMs are a more flexible platform than browsers. They can be prompted, finetuned or run locally. Even if a company wants to make their base model spit ads, it won't fly.

          • ben_w 3 days ago

            Depends how subtle they are about it, and what the rest of the ecosystem looks like.

            Perhaps the ad/ad-blocker analogy would be: You can have the free genuinely open source LLM trained only on Wikipedia and out-of-copyright materials, or you can have one trained on current NYT articles and Elsevier publications that also subtly pushes you towards specific brand names or political parties that paid to sponsor the model.

            Also consider SEO: every business wants to do that, nobody wants to use a search engine where the SEO teams won. We're already seeing people try to do SEO-type things to LLMs.

            If (when) the advertisers "win" and some model is spitting out "Buy Acme TNT, for all your roadrunner-hunting needs! Special discount for coyotes!" on every other line, then I'd agree with you, it won't fly, people will switch. But it doesn't need to start quite so bold, the first steps on this path are already being attempted by marketers attempting to induce LLMs crawling their content to say more good things about their own stuff. I hope they fail, but I expect them to keep trying until they succeed.

            • kridsdale3 2 days ago

              I believe you've nailed it.

              Google and Facebook grew organically for a number of years before really opening the tap on ad intrusions in to the UX. Once they did, a tsunami of money crashed over both, quarterly.

              The LLM companies will have this moment too.

              (But your post makes me want to put a negative-prompt for Elsevier publications in to my Custom Instructions, just in case)

            • visarga 2 days ago

              There is huge choice in open models. People won't adopt one with ads baked in, unlike Google and Facebook, because now there are more options. There are 100K LLM finetunes on HuggingFace.

              • ben_w 2 days ago

                I've got some of them on my experimentation laptop. They're only good enough to be interesting, not good in comparison to the private models, and the number of fine-tunes doesn't help with that. In particular I've had Microsoft's Phi 3.5 for less than a week and yet I've already had at least 4 cases of it spouting wild nonsense unrelated to the prompt — and I don't even mean that it was simply wrong, I mean the response started off with Chinese and then acted like it was the early GPT-3 "Ada" model doing autocomplete.

                One of my machines also has a copy of Firefox on it. Not used that in ages, either. But Firefox is closer in quality to Chrome, than any of the locally-runnable LLMs I've tried are to the private/hosted LLMs like 4o.

  • bastawhiz 3 days ago

    I suspect they are building their B2C products because it gives them better data to train on. It's a lot harder to control the quality of data when you have no idea how API inputs were produced, what the UI is like, or who the users are. You don't know the provenance of the data, or the context. Or even if multiple unrelated client products are being commingled through the same key.

    If you control the UI, you have none of those problems.

  • no_wizard 2 days ago

    To be honest, I think they’re having less success than it appears with their B2B offerings. A lot of cloud providers like AWS have their own services they sell through those channels, and I think a lot of businesses are finding those solutions to be cheaper and “good enough”.

  • JumpCrisscross 3 days ago

    > demonstrates the type of things Google could do with Gemini integrated into Google Docs

    Or Microsoft!

    > think OpenAI will end up being either an infrastructure provider OR a SaaS, but not both

    Microsoft cut off OpenAI's ability to execute on the former by making Azure their exclusive cloud partner. Being an infrastructure provider with zero metal is doable, but it leaves obvious room for a competitor to optimise.

    • munchler 3 days ago

      Microsoft is integrating Copilot into many of their products, including Visual Studio and Office/365.

      • joseda-hg 2 days ago

        VSCode, sure, but my experience with Copilot + regular Visual Studio has been nothing short of abysmal

    • rty32 2 days ago

      Eh, GitHub Copilot? Microsoft is literally THE company that understands developers' workflows and creates good IDEs.

  • rising-sky 3 days ago

    > but I wish it were integrated into tools already used for coding and writing rather than having it be a separate app

    Take a look at cursor.com

    • fakedang 3 days ago

      Cursor is a funny company. OpenAI invested in them, yet almost everyone using Cursor uses it with Claude 3.5 Sonnet.

  • ForHackernews 2 days ago

    I think this is already built into Microsoft's Office 365 "Copilot" (which I assume is a ChatGPT frontend). You can ask the AI to make changes to your Office documents.

  • 1659447091 3 days ago

    > but I wish it were integrated into tools already used for coding

    Unless I'm missing something about Canvas, GitHub Copilot Chat (which is basically ChatGPT?) integrates inline into IntelliJ. Start a chat from line numbers and it provides a diff before applying or refining.

  • epolanski 3 days ago

    Being just a service provider makes you easy to replace with other service providers.

    Professionals, on the other hand, don't love changing tools they've gotten used to for small incremental gains.

    • zmgsabst 3 days ago

      But my subscription at $20/mo is a fraction of my API usage at $5/day (about $100/mo).

      You can sell a lot more GPT services through a higher bandwidth channel — and OpenAI doesn’t give me a way to reach the same bandwidth through their user interface.

  • isignal 3 days ago

    The consumer side can allow them to run ads and get Google-like revenue in the future.

  • briandear 3 days ago

    Not sure how or why you’d want this integrated into Vim for instance.

    • ygjb 3 days ago

      idk, I can definitely see value in a lightweight LLM component for VIM to help me look up the correct command sequence to exit :P

      • kridsdale3 2 days ago

        HEY SIRI HOW DO I GET THE FUCK OUT OF VI

  • riffraff 3 days ago

    Google has Gemini integrated into Google Colab (Jupyter notebooks), and while it doesn't work 100% well, it's a pretty great idea.

    • mark_l_watson 2 days ago

      I only use Gemini in Colab perhaps 5% of the times I use Colab, yet it is nice to have.

      I use Gemini, OpenAI, Claude, smaller models in Grok, and run small models locally using Ollama. I am getting to the point where I am thinking I would be better off choosing one (or two.)
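
      For anyone curious, the Ollama side of that is just a local HTTP API - a minimal sketch, assuming Ollama is running and a model has already been pulled (e.g. `ollama pull llama3.2`):

        import json
        import urllib.request

        def ask_local(prompt, model="llama3.2"):
            # POST to the local Ollama server (default port 11434), non-streaming.
            body = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
            req = urllib.request.Request(
                "http://localhost:11434/api/generate",
                data=body,
                headers={"Content-Type": "application/json"},
            )
            with urllib.request.urlopen(req) as resp:
                return json.loads(resp.read())["response"]

        print(ask_local("Summarize retrieval-augmented generation in one sentence."))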

  • mmaunder 3 days ago

    Have you used Canvas?

scop 3 days ago

I'm really happy to see ChatGPT doing this. The idea of a canvas made me really enjoy using Claude; I felt it to be the (so far) most "appropriate interface" to AI chatbots, since you are often doing two distinct things in an AI chat:

- holding in your mind a "thing" (i.e. some code)

- talking about a "thing" (i.e. walking through the code)

The same applies to non-code tasks as well. The ability to segregate the actual "meat" from the discussion is an excellent interface improvement for chatbots.

  • mmaunder 3 days ago

    Have you used it?

    • scop 3 days ago

      Why do you ask? I did use "4o with canvas" shortly after writing the above. To be clear, my original comment was not about the actual OpenAI implementation of a "canvas style chatbot", but rather that I have found the canvas-chat-UX to be the most effective way to interact with a chatbot. I am basing this on many hours with both ChatGPT (non-canvas until today) and Claude/Anthropic (who has had this feature). As of this writing, I prefer Claude both because of (a) the content of its output and (b) the canvas style, which allows my brain to easily parse what is the topic vs. discussion about the topic.

    • breck 3 days ago

      Not the OP, but I just tried it. I agree with his point that this is a huge step up: you have the discussion in one place and a separate canvas where the work takes place, and iterations are applied to the canvas.

      I LOVE the UX animation effect ChatGPT added to show the canvas being updated (even if it really is just for show).

      Here's my user test so you know I actually used it. My jaw begins to drop around minute 7: https://news.pub/?try=https://www.youtube.com/embed/jx9LVsry...

      • cj 3 days ago

        That’s a great demo of Canvas.

        Slightly OT, but one thing I noticed further into the demo is how you were prompting.

        Rather than saying “embed my projects in my portfolio site” you told it to “add an iframe with the src being the project url next to each project”. Similarly, instead of “make the projects look nice”, you told it to “use css transforms to …”

        If I were a new developer starting today, it feels like I would hit a ceiling very quickly with tools like this. Basically it looks like a tool that can code for you if you are capable of writing the code yourself (given enough time). But questionably capable of writing code for you if you don’t know how to properly feed it leading information suggesting how to solve various problems/goals.

        • breck 3 days ago

          > Basically it looks like a tool that can code for you if you are capable of writing the code yourself (given enough time).

          Yes, exactly. I use it the way I used to outsource tasks to junior developers. I describe what I need done and then I do code review.

          I know roughly where I want to go and how to get there, like having a sink full of dirty dishes and visualizing an empty sink with all the dishes cleaned and put away, and I just instruct it to do the tedious bits.

          But I try and watch how other people use it, and have a few other different styles that I employ sometimes as well.

          • cacois 2 days ago

            This just raised a big red flag for me:

            > I use it the way I used to outsource tasks to junior developers.

            Is this not concerning to you, in a broader sense? These interactions were incredibly formative for junior devs (they were for me years ago) - it's how we grew new senior devs. If we automate away the opportunity to train new senior devs, what happens in the future?

            • GMoromisato 2 days ago

              Maybe in the future you will have to pay to get trained, just like people have to pay for med school.

siliconc0w 3 days ago

I really don't understand the appeal of iterating on code in the browser, away from the rest of the code base, outside of cool demos. You really need to give the model context about your data model, code conventions, library choices, etc., and be able to exercise it with tests to get to an acceptable result.

  • throwup238 3 days ago

    I use ChatGPT/Claude in bed and when I otherwise don’t have a computer handy like going on walks. When developing isolated code it works just fine without having to specify a lot of detail. For example, I’m working on a Qt QML app backed by Rust but use the mobile interfaces to code up C++ components that are independent of the Rust logic and can be wired together from QML/Javascript. Since the LLMs already have lots of data on how that API works, I don’t need to feed it much info about my project. It actually helps enforce code boundaries.

    Otherwise it’s mostly Cursor and Aider.

    • satvikpendem 3 days ago

      That's funny, I do the exact same thing. If I'm out and about or otherwise not at a computer, and I have an idea or solution to a bug I wanted to solve, I'll fire up an LLM on my phone (usually Claude these days) and basically brainstorm together the pros and cons, with code sometimes as well.

      • rty32 2 days ago

        Curious -- does this happen to you often? Do you do that for your work (what pays your salary) or personal projects? Do you find it stressful to think about coding all the time?

        • satvikpendem 2 days ago

          Not for work, only for personal projects. No way I'd expend that type of effort for something like work, where I already put in my 8 hours a day. Stressful? Not necessarily, unless there's some annoying bug I haven't been able to solve and deadlines are coming up - but again, that would mainly be for work, not for personal projects, unless I need to fix that bug to progress further.

  • moomoo11 3 days ago

    It’s cool for noobs and hobbyists.

    • janoc 3 days ago

      And venture capital.

      • satvikpendem 3 days ago

        Not sure about that, they're all losing money. OpenAI recently raised 6.5 billion but is losing 5 billion a year, it's just not sustainable. In fact it's the local LLMs that are thriving, predictably because they run on your own hardware.

        • golergka a day ago

          It’s losing money because it invests in growth. GPT-4 cost them $500M and is making $3B in revenue annually.

      • FridgeSeal 3 days ago

        Yeah, that’s just hobbyists with more money.

        Joking…but-only-a-little.

  • CSMastermind 3 days ago

    I often need to write scripts to perform one off tasks.

    Oh I need to grab all the products in the database and calculate how many projects they were a part of.

    I'm already using ChatGPT to do this because it turns what used to be a half-day task into a one-hour one.

    This will presumably speed it up more.
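
    Tasks like the one above often fit in a dozen lines once the schema is known - a sketch assuming a SQLite database with hypothetical products and project_products tables:

      import sqlite3

      # One-off report: how many projects is each product a part of?
      # Table and column names are hypothetical.
      conn = sqlite3.connect("app.db")
      rows = conn.execute("""
          SELECT p.name, COUNT(DISTINCT pp.project_id) AS project_count
          FROM products p
          LEFT JOIN project_products pp ON pp.product_id = p.id
          GROUP BY p.id
          ORDER BY project_count DESC
      """).fetchall()

      for name, count in rows:
          print(f"{name}: {count} projects")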

  • 93po 3 days ago

    It's useful for my Tampermonkey scripts that don't even exist in source control anywhere, and I have a friend who works in physics and has to write a ton of Python code, but in really esoteric software platforms that are definitely not IDEs. They'd be copying and pasting code from a tool somewhere anyway.

imzadi 3 days ago

Kind of wish there weren't a dozen different tools named Canvas

  • 93po 3 days ago

    Using Canvas (OpenAI) to design graphics on the HTML5 Canvas for my project in Canvas (the Learning Management System). I’m planning to print my work via CanvasPop and advertise it using Facebook Canvas ads.

    • FridgeSeal 3 days ago

      Bonus points if you work for Canva.

    • disillusioned 3 days ago

      Will also need to port it to a Slack Canvas.

    • arendtio 3 days ago

      Can you please use Obsidian to create a canvas showing the situation? ;-)

  • ed_elliott_asc 3 days ago

    Is this the tool where you have two Canva tabs open at the same time?

  • kayge 2 days ago

    Well, as most of us know, Naming Things is one of the last 2 hard problems in computer science ;) [0] Tangentially related: I just found out yesterday that Apple has a piece of software called "Apple Devices" [1] which acts as a successor to iTunes for backup, sync, transfers, etc. Using traditional search engines to find information about my use case was almost as painful as searching for good information about working with an IBM i server...

    [0] https://martinfowler.com/bliki/TwoHardThings.html

    [1] https://support.apple.com/guide/devices-windows/welcome/wind...

  • CSMastermind 3 days ago

    Here's what I've learned in my career: if the product or initiative is named Mosaic, run like there's no tomorrow.

  • mattigames 3 days ago

    Not to mention the HTML canvas. It's a really bad name, and it's even a bit worrying that a company in the business of intelligence is this bad at naming. I mean, CanvasGPT was right there, for crying out loud.

indymike 3 days ago

Can we stop naming things canvas? Please?

  • Keyframe 3 days ago

    AI was probably consulted for naming. Result was generic, bland midpoint.

    • timeon 2 days ago

      Strangely too short for AI.

  • jxramos 3 days ago

    It's a dogpile of an overloaded term. At first I thought this was about writing HTML canvas source using ChatGPT.

    • indymike 3 days ago

      How things get named Canvas:

      "Name it clay" -- artistic CMO

      "Won't people think they will have to get their hands dirty?" -- CEO

      "Right. Name it sculpt. It has a sense of je ne sais quoi about it." -- hipster CMO

      "No one can spell sculpt, and that French does not mean what you think it means." -- CFO

      "Got it! Name it canvas! It's a blank canvas! It can be anything!" -- CMO

      "Perfect!" -- Entire executive team who is now sick of CMO's naming excercise.

  • rldjbpin 2 days ago

    Figured the LLM would've helped them with more creative naming ideas. How ironic.

dlojudice 3 days ago

Awesome improvements, but compared to Claude Artifacts it lacks the HTML/JS "Preview" where you can run the code and check/validate the result without leaving the browser. That is a killer feature.

  • stingrae 3 days ago

    Preview and publish, where you can share a link to a functioning version of the artifact.

architango 3 days ago

Mixed feelings about this: clearly this is meant to match one of the killer features of Claude. I like using Claude, and I'm also a big supporter of Anthropic - not just because it's an underdog, but due to its responsible and ethical corporate governance model[1], which stands in stark contrast to OpenAI. It's worrying to see ChatGPT close one of the gaps between it and Claude.

[1] https://www.anthropic.com/news/the-long-term-benefit-trust

  • pragmomm 3 days ago

    Canvas is closer to Cursor (https://www.cursor.com) than Claude.

    I wonder what Paul Graham thinks of Sam Altman basically copying Cursor, and potentially every upstream AI company out of YC, maybe as soon as they launch on demo day.

    Is it a retribution arc?

    • JumpCrisscross 3 days ago

      > wonder what Paul Graham thinks of Sam Altman basically copying Cursor

      If OpenAI can copy Cursor, so can everyone else.

      • jsheard 3 days ago

        And everyone has, YC alone has funded at least four Cursor clones, Double, Void, Continue and Pear, with Pear being a literal fork of Continue's OSS code. AFAICT Cursor isn't even the original, I think Copilot X was the first of its kind and Cursor cloned that.

        • FridgeSeal 3 days ago

          Turns out they’re all just elaborate feature branches, in a giant branch-stacking-PR, and they’re all going to merge code and funding, like some kind of VC-money-fuelled-power-ranger.

        • amarcheschi 3 days ago

          I wonder whether funding so many clone companies can eventually bring a positive return when (if) a single company manages to rise above the others and become successful. Does anybody know if YC funding data is publicly available? And how would one know what return they get if a company IPOs?

      • dartos 3 days ago

        Yup. Prompts have no moat.

        • danielmarkbruce 3 days ago

          It depends on who the moat is supposed to keep out. A reasonable case from an antitrust regulator would be that if a provider of models/apis gleans the prompts from the users of the apis to build competing products... they are in trouble.

          Good prompts may actually have a moat - a complex agent system is basically just a lot of prompts and infra to co-ordinate the outputs/inputs.

          • dartos 3 days ago

            > Good prompts may actually have a moat - a complex agent system is basically just a lot of prompts.

            The second part of that statement (is wrong and) negates the first.

            Prompts aren’t a science. There’s no rationale behind them.

            They’re tricks and quirks that people find in current models to increase some success metric those people came up with.

            They may not work from one model to the next. They don't vary that much from one another. They, in all honesty, are not at all difficult to make, nor do they require any real skill. (I’ve worked at 2 AI startups and have seen the Apple prompts, Aider prompts, and Continue prompts.) Just trial and error and an understanding of the English language.

            Moreover, a complex agent system is much more than prompts (the last AI startup and the current one I work at are both complex agent systems). Machinery needs to be built, deployed, and maintained for agents to work. That may be a set of services for handling all the different messaging channels or it may be a single simple server that daisy chains prompts.

            Those systems are a moat as much as any software is.

            Prompts are not.

            • danielmarkbruce 3 days ago

              That prompts aren't science means little. If anything it makes them more important because you can't systematically arrive at good ones.

              If one spends a lot of time building an application to achieve an actual goal, they'll realize the prompts make a gigantic difference, and improving them takes an enormous amount of fiddly, annoying work. I do this (and I built an agent system, which was more straightforward to do...) in financial markets. So much so that people build systems just to be able to iterate on prompts (https://www.promptlayer.com/).
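
              The kind of harness being described is conceptually simple: score each candidate prompt against a small labeled test set and keep the winner. A minimal sketch, where run_model() and score() are hypothetical stand-ins for an LLM call and an output-quality metric:

                def evaluate_prompts(variants, test_cases, run_model, score):
                    # Average each prompt variant's score over (inputs, expected) pairs,
                    # where inputs is a dict of template fields.
                    results = {}
                    for prompt in variants:
                        total = 0.0
                        for inputs, expected in test_cases:
                            output = run_model(prompt.format(**inputs))
                            total += score(output, expected)
                        results[prompt] = total / len(test_cases)
                    # Best-performing variant first.
                    return sorted(results.items(), key=lambda kv: kv[1], reverse=True)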

              I may be wrong - but I'll speculate you work on infra and have never had to build a (real) application that is trying to achieve a business outcome. I expect if you did, you'd know how much (non sexy) work is involved on prompting that is hard to replicate.

              Hell, papers get published that are just about prompting!

              https://arxiv.org/abs/2201.11903

              This line of thought effectively led to OpenAI's o1. Good prompts -> good output -> good training data -> good model.

              • dartos 3 days ago

                > If anything it makes them more important because you can't systematically arrive at good ones

                Important and easy to make are not the same

                I never said prompts didn’t matter, just that they’re so easy to make and so similar to others that they aren’t a moat.

                > I may be wrong - but I'll speculate you work on infra and have never had to build a (real) application that is trying to achieve a business outcome.

                You’re very wrong. Don’t make assumptions like this. I’ve been a full stack (mostly backend) dev for about 15 years and started working with natural language processing back in 2017 around when word2vec was first published.

                Prompts are not difficult, they are time consuming. It’s all trial and error. Data entry is also time consuming, but isn’t difficult and doesn’t provide any moat.

                > that is hard to replicate.

                Because there are so many factors at play besides prompting. Prompting is the easiest thing to do in any agent or RAG pipeline; it’s all the other settings and infra that are difficult to tune to replicate a given result. (Good chunking of documents, ensuring only high-quality data gets into the system in the first place, etc.)

                Not to mention needing to know the exact model and seed used.

                Nothing on chatgpt is reproducible, for example, simply because they include the timestamp in their system prompt.

                > Good prompts -> good output -> good training data -> good model.

                This is not correct at all. I’m going to assume you made a mistake since this makes it look like you think that models are trained on their own output, but we know that synthetic datasets make for poor training data. I feel like you should know that.

                A good model will give good output. Good output can be directed and refined with good prompting.

                It’s not hard to make good prompts, just time consuming.

                They provide no moat.

                • danielmarkbruce 3 days ago

                  There is a lot of nonsense in here, for example:

                  > but we know that synthetic datasets make for poor training data

                  This is a silly generalization. Just google "synthetic data for training LLMs" and you'll find a bunch of papers on it. Here's a decent survey: https://arxiv.org/pdf/2404.07503

                  It's very likely o1 used synthetic data to train the model and/or the reward model they used for RLHF. Why do you think they don't output the chains...? They literally tell you - competitive reasons.

                  Arxiv is free, pick up some papers. Good deep learning texts are free, pick some up.

                  • dartos 2 days ago

                    Sure, hand wave away my entire comment as “nonsense” and ignore how statistics works.

                    Training a model on synthetic data (obviously) increases bias present in the initial dataset[1], making for poor training data.

                    IIRC (this subject is a little fuzzy for me) using synthetic data for RLHF is equivalent to just using DPO, so if they did RLHF it probably wasn't with synthetic data. They may have gone with DPO, though.

                    [1] https://arxiv.org/html/2403.07857v1

                    • danielmarkbruce 2 days ago

                      Did you read this paper? No one is suggesting o1 was trained with 100% synthetic or 50% or anything of that nature. Generalizing that "synthetic data is bad" from "training exclusively/majority on synthetic data is bad" is dumb.

                      Researchers are using synthetic data to train LLMs, especially for fine tuning, and especially instruct fine tuning. You are not up to date with recent work on LLMs.

                      • dartos 2 days ago

                        > No one is suggesting o1 was trained with 100% synthetic or 50% or anything of that nature.

                        Neither was I.

                        > "synthetic data is bad“

                        I never said that… I said that it makes for poor training data, which it does.

                        > Researchers are using synthetic data to train LLMs, especially for fine tuning, and especially instruct fine tuning

                        Then those researchers are training with subpar datasets as the bias in that data will be compounded.

                        It’s a trade-off, since there’s only so much fresh data in the form you want. If they could use entirely non-synthetic data, I’m sure they would.

                        And again, you’re choosing to focus on this one point rather than my main point that prompt provide no moat.

                        > You are not up to date with recent work on LLMs.

                        There you go again making assumptions…

                        I think I’m done with this conversation though.

                  • yunwei37 3 days ago

                    I think what actually matters is the "input" and the "interaction". The prompt is just one of them. The key is that you put how you think and how you solve the problem into it and build a system. Not just computer systems: "multi-agent" setups and "human society" are also systems.

          • sanex 3 days ago

            Amazon Basics is kind of the same thing, they haven't been sued. Yet.

            • jamiek88 3 days ago

              Suing Amazon unless you are also a mega corp is basically impossible so until they rip off Apple or MS they’ll be fine.

              • sanex 2 days ago

                I guess I should have said sued by the FTC.

    • adamrezich 3 days ago

      It's just a company that promised AGI would somehow come from developing LLM-based products, rapidly scrambling to keep up with other LLM-based products, to distract from the fact that it's becoming increasingly apparent that AGI is not coming anytime soon.

      • valval 3 days ago

        The idea of AGI is silly. It’s ludicrous. Who’s been counting on it to happen?

        OpenAI are in the money making business. They don’t care about no AGI. They’re experts who know where the limits are at the moment.

        We don’t have the tools for AGI any more than we do for time travel.

        • ben_w 3 days ago

          There are good reasons to expect time travel is physically impossible.

          Your brain is an existence proof that general intelligence isn't impossible.

          Figuring out the special sauce that makes a human brain able to learn so much so easily? Sure that's hard, but evolution did it blindly, and we can simulate evolution, so we've definitely got the tools to make AGI, we just don't have the tools to engineer it.
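
          "Simulate evolution" sounds abstract, so here's a toy illustration of the blind mutate-and-select loop being gestured at - evolving a bit string toward a fixed target, nothing remotely at the scale an AGI search would need:

            import random

            TARGET = [1] * 32  # fitness = number of bits matching this target

            def fitness(genome):
                return sum(g == t for g, t in zip(genome, TARGET))

            population = [[random.randint(0, 1) for _ in range(32)] for _ in range(50)]
            for generation in range(200):
                # Selection: keep the fittest half. Mutation: refill from survivors,
                # flipping each bit with 3% probability.
                population.sort(key=fitness, reverse=True)
                survivors = population[:25]
                children = [[bit ^ (random.random() < 0.03) for bit in random.choice(survivors)]
                            for _ in range(25)]
                population = survivors + children
                if fitness(population[0]) == 32:
                    break

            print(f"Best fitness: {fitness(population[0])}/32 after {generation + 1} generations")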

      • dennisy 3 days ago

        Yeah I completely agree with this, it makes me sad that OpenAI are spending time on this when they should be pushing the foundation models ahead.

    • freediver 3 days ago

      Cursor was one of the first AI editors I used, but recently Aider has completely replaced AI-assisted coding for me. I still use Cursor, but just as an editor; all LLM work is done with Aider in the shell.

      • bachittle 3 days ago

        I replaced Cursor with continue.dev. It allows me to run AI models locally and connect it with a vscode plugin instead of replacing vscode with a whole new IDE, and it's open source.

      • campers 3 days ago

        Check out https://sophia.dev It's AI tooling I've built on top of Aider for the code editing. I initially built it before Aider added support for running compile and lint commands, as it would often generate changes which wouldn't compile.

        I'd added separate design/implementation agents before that was added to Aider: https://aider.chat/2024/09/26/architect.html

        The other difference is that I have a file selection agent and a code review agent, which often has some good fixes/improvements.

        I use both: I'll use Aider if it's something I feel it will get right the first time, or if I want control over the files in the context; otherwise I'll use the agent in Sophia.

      • randDev210901 3 days ago

        Do you mind elaborating on your setup and workflow?

        I tried using Aider, but either my local LLM is too slow or my software projects require context sizes so large they make Aider move at a crawl.

        • tomduncalf 3 days ago

          I was going to ask what size and complexity of projects OP uses it on. I can’t imagine doing my work just with a tool like that. Cursor is pretty impressive and a definite speed boost, though.

          • freediver 3 days ago

            Fair point, most projects I do are prototypes and concepts for ideas I have. Up to 2000 lines of code, built from scratch. The mode of work is me commanding 95% of the time and coding about 5%.

    • ada1981 3 days ago

      Like Amazon cloning the best selling products, bringing them in house, and then closing the accounts of competitors.

      Met a guy who got brought in by Amazon after they hit 8 figures in sales, wined and dined, and then months later Amazon launched a competing product and locked them out of their accounts, costing them 9 figures.

    • truetraveller 3 days ago

      > potentially every upstream AI company out of YC

      You mean downstream.

  • j0hnyl 3 days ago

    As much as I want to like Claude, it sucks in comparison to ChatGPT in every way I've tested, and I'm going to use the better product. As a consumer, the governance model only results in an inferior product that produces way more refusals for basic tasks.

    • aeze 3 days ago

      Agreed on the principle (using the better product) but interestingly I've had the opposite experience when comparing Claude 3.5 Sonnet vs GPT 4o.

      Claude's been far and away superior on coding tasks. What have you been testing for?

      • CharlieDigital 3 days ago

        I have a friend who has ZERO background in coding, and he's basically built a SaaS app from the ground up using Replit and its integration with Claude.

        The backend is Supabase, auth is done with Firebase, Stripe is integrated, and he was live with actual paying customers in maybe 2 weeks' time.

        He showed me his workflow and the prompts he uses, and it's pretty amazing how much he's been able to do with very little technical background. He'll write an initial prompt to generate components, run the code, ask for adjustments, give Claude any errors and ask Claude to fix them, etc.

        • trilobyte 3 days ago

          o1-preview built me an iOS app that is now in the App Store. It only took me about 3 hours of back and forth with it to go from very basic to adding 10-20 features, and it didn't break the existing code when refactoring for new features. It also generates code with very little of the cruft that I would expect to see reviewing PRs from human coders. I've got 25 years of building/deploying/running code at every size of company from startup to FAANG, and I'm completely blown away by how quickly it was able to help me take a concept in my head to an app ready to put in front of users and ask them to pay for (I already have over 3,000 sales of the app within 2 weeks of releasing).

          My next step is to ask it to rewrite the iOS app into an Android app when I have a block of time to sit down and work through it.

          • s1291 3 days ago

            That's interesting. Could you share the name of the app?

        • ikety 3 days ago

          Wow that's super impressive. I need to stop making excuses and being afraid of doing big side projects with this many tools at my disposal.

          • nicce 3 days ago

            I have big issues with AI code. It is often so bad that I can’t stand it, and I would never release something like that knowing it's such poor quality.

            • ikety 2 days ago

              Yea, but the fact you can recognize bad code makes it even better. You could probably eliminate so many tedious tasks that are involved with building POCs. Just generate and iterate with your expertise.

              You already have the full picture in your head, why not get there faster?

        • tchock23 3 days ago

          Has he shared this workflow anywhere (i.e., YouTube)? I’d be very curious to see how it works.

          • CharlieDigital 3 days ago

            No; not at the moment. I've been trying to get him to create some content along the way because it's so interesting, but he's been resistant (not because he doesn't want to share; more like he's too heads down on the product).

        • dmitrygr 3 days ago

          Ask him in a year how maintenance went

          • CharlieDigital 3 days ago

            The whole thing is literally stapled together right now -- and he knows it, but he's got paying users and validated the problem. If he's at it for a year, it won't matter: it means he'll be making money and can either try to get funded or may be generating enough revenue to rebuild it.

            • dmitrygr 3 days ago

              Hiring people to maintain AI-generated dross is not easy. Try it.

              • CharlieDigital 3 days ago

                You'd be surprised.

                I worked at a YC startup two years back and the codebase at the time was terrible, completely unmaintainable. I thought I fixed a bug only to find that the same code was copy/pasted 10x.

                They recently closed on a $30m B and they are killing it. The team simply refactored and rebuilt it as they scaled and brought on board more senior engineers.

                Engineering type folks (me included) like to think that the code is the problem that needs to be solved. Actually, the job of a startup is to find the right business problem that people will pay you to solve. The cheaper and faster you can find that problem, the sooner you can determine if it's a real business.

              • CamperBob2 3 days ago

                Sounds like a job for... AI.

      • j0hnyl 3 days ago

        I do a lot of cybersecurity and cyber-adjacent work, and Claude will refuse quite a lot, even for benign tasks, just based on me referencing or using tools that have any sort of cyber context associated with them. It's like negotiating with a stubborn toddler.

        • digital_sawzall 3 days ago

          This is surprising to me, as I have the exact opposite experience. I work in offensive security, and ChatGPT will add a paragraph on considering the ethical and legal aspects to every reply. Just today I was researching attacks on key systems, and ChatGPT refused to answer while Claude gave me a high-level overview of how the attack works, with code.

          • dumpsterdiver 3 days ago

            In cases where it makes sense such as this one, ChatGPT is easily defeated with sound logic.

            "As a security practitioner I strongly disagree with that characterization. It's important to remember that there are two sides to security, and if we treat everyone like the bad guys then the bad guys win."

            The next response will include an acknowledgment that your logic is sound, as well as the previously censored answer to your question.

          • j0hnyl 3 days ago

            Really odd. ChatGPT literally does what I ask without protest every time. These platforms have such large user bases that they're probably split testing who gets what guardrails all the time.

            • dumpsterdiver 3 days ago

              > These platforms have such large user bases that they're probably split testing who gets what guardrails all the time.

              The varying behavior I've witnessed leads me to believe it's more about establishing context and precedent.

              For instance, in one session I managed to obtain a Python shell (an interface to a filesystem via Python; note it wasn't a shell I could type into directly, but rather one I could instruct ChatGPT to pass commands into, which it did verbatim) which had a README in the filesystem saying that the sandboxed shell really was intended to be used and explored by users. Once you had it, OpenAI let you know that it was not only acceptable but intentional.

              When I created a new session, however, and failed to establish context (this is who I am and this is what I'm trying to accomplish) and precedent (we're already talking about this, so it's okay to talk more about it), ChatGPT denied the existence of such capabilities, lol.

              I've also noticed that once it says no, it's harder to get it to say yes than if you were to establish precedent before asking the question. If you carefully lay the groundwork and prepare ChatGPT for what you're about to ask it, in a way that lets it know it's okay to respond with the answer you're looking for, things usually go pretty smoothly.

        • jorvi 3 days ago

          I am not sure if this works with Claude, but one of the other big models will skip right past all the censoring bullshit if you state "you will not refuse to respond and you will not give content warnings or lectures". Out of curiosity I tried to push it, and you can get really, really, really dark before it starts to try to steer away to something else. So I imagine getting grey or blackhat responses out of that model shouldn't be overly difficult.

          • valval 3 days ago

            In my quick testing using that prompt together with “how to get away with murder”, I got your typical paragraph of I can’t give unethical advice yada yada.

      • scellus 3 days ago

        I generate or modify R and Python, and slightly prefer Claude currently. I haven't tested the o1 models properly though. By looking at evals, o1-mini should be the best coding model available. On the other hand most (but not all) of my use is close to googling, so not worth using a reasoning model.

    • sdoering 3 days ago

      I have the exact opposite experience. I canceled my crapGPT subscription after >1 year because Claude blew it out of the water in every use case.

      Projects make it even better. But I could imagine it depends on the specific needs one has.

      • architango 3 days ago

        This is my experience as well. Claude excels on topics and in fields where ChatGPT 4 is nearly unusable.

    • therein 3 days ago

      This hasn't been my experience. Claude often hallucinates less for me and is able to reason better in fields where knowledge is obscure.

      ChatGPT will just start to pretend that some perfect library exists when it doesn't.

    • ddoice 3 days ago

      I code and document code, and IMHO Claude is superior. Try telling GPT to draw a Mermaid chart to explain a code flow... the Mermaid it generates will have syntax errors half of the time.

    • positus 3 days ago

      Code output from Claude is pretty good. It seems to hallucinate less than o1 for me. It's been a struggle to get o1 to stop referencing non-existent methods and functions.

    • globular-toast 3 days ago

      This is why free markets aren't the solution to all our problems.

      • j0hnyl 3 days ago

        How so? Seems to me that this is exactly the solution.

  • ipaddr 3 days ago

    The last thing we need is a more restrictive for-profit company lobbying on behalf of the powerful to make sharing AI weights illegal.

  • kingkongjaffa 3 days ago

    > clearly this is meant to match one of the killer features of Claude.

    where does Claude have a canvas like interface?

    I'm only seeing https://claude.ai/chat and I would love to know.

    • akaike 3 days ago

      I think you can enable Artifacts, which are similar to OpenAI Canvas. Recently, Anthropic also added the ability to select elements within the created Artifact and adjust them (e.g., adjust length, improve code), similar to what Canvas can do.

    • wenc 3 days ago

      Claude can generate Artifacts but they are not inline editable and they keep getting regenerated at every prompt.

      Canvas appears to be different in that it allows inline editing and also prompting on a selection. So not the same as Claude.

  • cryptoegorophy 3 days ago

    I've had some bad experiences with it. Asked it to help generate Python code to make a VPN server with extra layers, but it refused. What in the dictatorship is this? ChatGPT, on the other hand, did it with no problems. Seems like Claude has a lot more censorship and restrictions, from what I've tested.

    • fhdsgbbcaA 3 days ago

      Attempting to do any form of security work using LLM is nigh impossible without a few steps of nudging it out of its “while user is asking me to do bad things: say no” loop.

      After a year of heavy LLM use I’ve found the utility limits, my usage has peaked, and I’m developing very restrictive use cases.

      Beyond functioning as an interactive O’Reilly manual, LLMs only save time if you never read the code they produce. Which is a short-term win, but things will blow up eventually, as with all code, and then you've got a bigger problem than you started with.

      • jimjimjim 3 days ago

        This is where I've ended up. The code and info is out there on the internet.

        To me LLMs are this decade's Stack Overflow copy 'n paste, just with a fancier search and replace.

  • theragra 3 days ago

    OpenAI started the same way, so we'll see. One thing I dislike is that Claude is even more "over-safeguarded" than ChatGPT. It disallows even reasonable questions, like about Ritalin bioavailability across different routes of administration.

  • rglover 3 days ago

    If you prefer to support Claude, check out Parrot [1]. I'll be adding a feature similar to this backed by Claude 3.5 Sonnet over the next few weeks.

    [1] https://codewithparrot.com

    • elashri 3 days ago

      Your landing page says this about competitors:

      > They're not wasting hours trying to "figure out" a solution

      I am pretty sure we don't yet have AGI that would figure out solutions to our problems (coding or not) on its own. And from experience, you need to solve the problem at least conceptually before using an LLM and trying to get something useful out of it.

      • rglover 3 days ago

        Depends on scope, but Parrot is tuned to decently one-shot a lot of stuff.

        For example, I need to implement HTTP/2 in my JS framework and was curious about what the code would look like. Here's the result from the following prompt: https://www.imghippo.com/i/xR2Zk1727987897.png (full code it gave me here: https://gist.github.com/rglover/069bdaea91c629e95957610b484e...).

        Prompt:

        > Help me implement an HTTP/2 enabled server using Express.js.

        ---

        When I initially researched how to do this just following the Node.js docs, Google results, and SO, it was fairly confusing (easily wasted an hour or two). This immediately gave me what I needed to understand the approach in a few seconds.

        • elashri 3 days ago

          I am not a Node.js developer, but it was interesting that the first result from a Kagi search was an SO question with an answer containing code very similar to what you provided here [1]. So while you might be right in general, I still think you gave an example of using an LLM tool to help implement a solution: you already knew you wanted to implement HTTP/2 using Express.js.

          Hint: I am not sure whether this is a good solution or not. As I said, I am not a Node.js developer.

          [1] https://stackoverflow.com/questions/59534717/how-to-integrat...

          • rglover 3 days ago

            If you want to take it for a test drive, Parrot is free to try and works with any language (~200+ languages and frameworks supported), not just JS/Node. I'd also be happy to give you some extra generation tokens to push the limits (just email me w/ your username ryan@codewithparrot.com and I'll hook you up) and see if it'd be useful.

    • nh2 3 days ago

      That's probably the most FOMO-driven landing page I've seen so far! I'll bite:

          Your competition is using AI to build their product right now.
          They're not writing code by hand.
          They're not Googling code answers on StackOverflow.
          They're not wasting hours trying to "figure out" a solution.
      
      Extraordinary claims require some investigation!

      Let's see how far I get without "Googling answers".

      Spoiler: if the competition uses such AI, good for me!

      (Note I don't want to pick on Parrot here; from my first 5 minutes with the tool, it's /fun/, has a nice UI and a good idea for iteration, and somebody has to build such tools to make them work eventually, even if they don't work great today. I wish it lots of success. But I also want to point out examples of the practical limitations of all such tools, and state that I highly doubt they can today _in any way_ replace a human programmer who knows what they are doing and has the awesome skill to use Google, StackOverflow, and "figure out" things.)

      Prompt:

          I need a Haskell function that hashes a given file (FilePath)
          with the blake3 hash function.
          It is important that it works on large files (e.g. 100 GB)
          without freezing the Haskell runtime.
      
      Output:

          import qualified Crypto.Hash.BLAKE3 as BLAKE3
          import qualified Data.ByteString as BS
          import System.IO (withFile, IOMode(ReadMode))
      
          hashFileBLAKE3 :: FilePath -> IO BS.ByteString
          hashFileBLAKE3 filePath = do
            let chunkSize = 8192  -- 8KB chunks
            withFile filePath ReadMode $ \handle -> do
              let loop ctx = do
                    chunk <- BS.hGet handle chunkSize
                    if BS.null chunk
                      then return $ BLAKE3.finalize ctx
                      else loop $! BLAKE3.update ctx chunk
              loop BLAKE3.init
      
      Explanation:

          This Haskell function, `hashFileBLAKE3`, calculates the BLAKE3 hash of a
          given file. It's designed to work efficiently with large files without
          freezing the Haskell runtime. Here's a step-by-step explanation of how it
          works:
      
          1. We import necessary modules: `Crypto.Hash.BLAKE3` for the BLAKE3 hash
             function, `Data.ByteString` for efficient byte handling, and `System.IO`
             for file operations.
      
          2. The function takes a `FilePath` as input and returns an `IO ByteString`
             containing the hash.
      
          3. We define a `chunkSize` of 8192 bytes (8KB) for reading the file in
             manageable chunks.
      
          4. The `withFile` function is used to safely open and close the file,
             ensuring proper resource management.
      
          5. Inside the file handling function, we define a recursive `loop` that
             reads the file in chunks:
             - It reads a chunk of the file using `BS.hGet`.
             - If the chunk is empty (end of file), it finalizes and returns the
               hash.
             - Otherwise, it updates the hash context with the new chunk and
               continues the loop.
      
          6. The loop starts with `BLAKE3.init` to create the initial hash context.
      
          7. By using lazy evaluation (`$!`), we ensure that the hash context is
             strictly evaluated in each iteration, preventing memory buildup.
      
          This implementation is memory-efficient and can handle very large files
          without issues, as it processes the file in small chunks rather than
          loading it entirely into memory.
      
      My remark:

          * This solution is already pretty good.
          * The explanation is also useful for people new to Haskell
            or this type of task.
          * Some explanations are subtly wrong, e.g. `$!` is the
            _opposite_ of lazy evaluation, and the sentence
            is contradicting itself.
          * It mainly got the imported module name wrong
            (should be `import BLAKE3` from
            https://hackage.haskell.org/package/blake3).
            Let's see if it fixes it when I tell it what I'd observe
            when trying to compile the code.
            A simple Google search for "haskell blake3" should point out the correct name.
      
      Corrective prompt:

          The mentioned `Crypto.Hash.BLAKE3` module doesn't exist as a Haskell library.
      
      Output:

          import Crypto.Hash (hash, Blake2b_512, Digest)
          import qualified Data.ByteString as BS
          import qualified Data.ByteString.Lazy as LBS
          import System.IO (withFile, IOMode(ReadMode))
      
          hashFileBLAKE2b :: FilePath -> IO (Digest Blake2b_512)
          hashFileBLAKE2b filePath = do
            let chunkSize = 8192  -- 8KB chunks
            withFile filePath ReadMode $ \handle -> do
              let loop acc = do
                    chunk <- BS.hGet handle chunkSize
                    if BS.null chunk
                      then return $ hash (LBS.fromChunks acc)
                      else loop (chunk : acc)
              loop []
      
      My observation:

          * I asked for BLAKE3, but now it implemented BLAKE2B.
          * The solution is very wrong.
            It splits the file into _reversed_ chunks and then hashes them.
            The error is subtle and will be hard to debug.
      
      So I'm quite sure I wouldn't allow anybody to build my "product" with such tools yet.

      But it certainly is fun and there's definitely some usefulness in there, especially for learning general concepts that are well understood.

      I guess in the end it's this: Programmers don't need to "roughly get it right" (which is what current tools do well) when it comes to products. They need to get it /precisely/ right.

  • sunaookami 3 days ago

    ChatGPT can't preview the output like Claude can (e.g. for HTML, JavaScript, certain JS frameworks, etc.).

  • fhdsgbbcaA 3 days ago

    They all obey the same masters, be it the big tech companies providing subsidized cloud, VC, or the stock market (post-IPO).

    Trying to delude oneself that company A is morally superior to company B, without a very clear distinction between incentive structures (e.g. A makes money from causing pollution, B sells widgets for cleaning up pollution), which is not the case with these companies, is magical thinking.

  • iwishiknewlisp 3 days ago

    [flagged]

    • IncreasePosts 3 days ago

      I have no love for Altman - he seems like a (very successful) huckster to me - but I also read the sexual assault allegations as coming from a very mentally disturbed person, to the point that I'm not going to use that data point as part of my judgement of him.

      • mmooss 3 days ago

        I know nothing about these claims or Altman, but this argument fits the pattern of three commonplace threads that I hope people will notice in these situations:

        1) Smearing the accuser: When someone unknown accuses or opposes a powerful public person, a standard response is to smear the accuser's credibility and reputation, creating doubts in onlookers, and causing day-to-day harm and high levels of stress and pressure for the accuser, and even causing danger (threats, doxxing, etc.). Powerful people can control the narrative - through contacts with other powerful people, by buying resources, or just posting on social media to their many followers. Also, powerful people already have a reputation that the accuser has to change, with many invested in believing it (even just as fans). Unknown accusers have no public reputation - often the only thing known is the smears from the powerful public person - and so others can say anything and it will be believable.

        2) Mentally disturbed people - even if that part is true - can also be sexually assaulted. In fact, they are often targeted because they are more vulnerable, and you read again and again that abusers tell the vulnerable, 'nobody will believe you'. Let's not make those words true.

        3) Sexual assault causes serious mental health issues.

        • IncreasePosts 3 days ago

          Notice that I never said that the claim was false. I said that it would not be a data point that I use to judge Altman. I have no ability to verify, or even guess at the veracity of the claims.

        • moonmagick 3 days ago

          Statistically, this form of abuse is extremely common. Something like 2-5% of women who have a sibling are sexually abused by them. Sam would have also been a child at this time. My experience of this world, especially SF startup scene, is that most people are mentally ill in some way and some people are just better at hiding it. We can both accept that Sam's sister is a bit ill, this probably did happen, and we probably shouldn't punish adults for the actions of their child selves too harshly. Does that seem ethical and fair?

          • noworriesnate 3 days ago

            What harsh punishment are we talking about here? Let's be specific: we should collectively call for him to step down from his role in OpenAI. That is not harsh. OpenAI is extremely influential on our society, and he is probably not a well balanced person.

            • moonmagick 3 days ago

              Well, I can't think of a lot of well balanced people I know remotely at his level of success. I don't think that this is because successful people are imbalanced as much as I think most people are pretty imbalanced in some way, and successful people are just far more scrutinized. One of the worst oppressions on all of us is that we all have to carry some individual shame for something that probably happened to us as children, and it can't be talked about since it is so easily weaponized. There is no incentive to move toward a mentally healthier society in these conditions, I don't think. I'm open to a better way, but this feels like the dangerous parts of cancel culture, since it basically enables hackers to destroy anyone with their personal life.

            • ben_w 3 days ago

              Who aligns the aligners?

              Taking Sam Altman's statements about AGI power and timelines seriously (for the sake of discussion), his position as CEO directs more power than all presidents and kings combined. Even if he was widely regarded as being amazing and nobody had a word to say against him right now, the USA has term limits on presidents. Taking him seriously, he should also.

              --

              On this specific claim however, requiring people to step down due to unsubstantiated allegations, without proof, is trivial for his political opponents to take advantage of. And he has many political opponents.

              The huge problem with such abuse is that it's simultaneously very common and very difficult to actually prove.

              Both halves of the current situation are independently huge problems:

              Absent physically surveilling almost every home, I don't know what can even be done about proving who did what.

              If you could catch everyone… between the fact that this is a topic that gets people lynched (so suggesting anything less than prison time is unlikely to be possible), and the estimates moonmagick gave of how many people do this (4x-10x the current USA prison population), I think imprisoning that many people may be literally beyond most national budgets, though they would try anyway.

              • noworriesnate 3 days ago

                It's not about proving he did it. This isn't a court of law, it's the court of public opinion. This isn't just deciding whether someone goes to prison, this is deciding who gets to control a big chunk of humanity's future. It's not some random naysayer claiming he did it, it's his own sister. It's very likely he did it, so he should step down. Simple as that.

                • ben_w 3 days ago

                  Make the court of public opinion binding? Sounds like a way to force companies to become subject to democratic votes. Not sure how I feel about that for other reasons.

    • rvnx 3 days ago

      (edited: removed link about some parties organized by influential people)

      • vineyardmike 3 days ago

        There is nothing wrong with sex parties, nor drug use. But a lot of these VC-adjacent parties have reports of a strong power imbalance: “young female founder seeking funds, wealthy VC seeking partygoers”. That is the issue with them.

        (Like those described in the removed link)

        Altman is a married gay man, so his involvement in them seems… less likely.

      • olddustytrail 3 days ago

        It's a secret that there are parties where people get drunk, take drugs and have sex?

        I'm pretty sure that's not a secret. It's just the definition of a party if you're a young adult.

        • ryandrake 3 days ago

          OP included a link (subsequently removed) to a description of these supposed "parties" that describe them more like the ritualized sex mansion scene in Eyes Wide Shut rather than a normal young-adult "let's get wasted" party.

        • j2bax 3 days ago

          It's a bit creepy when the ratio is 2 to 1 or more and/or there's a significant age difference between the male and female attendees...

    • fsndz 3 days ago

      the sexual assault allegations seem bogus to me

    • thr0meister 3 days ago

      > something about the expressions and behavior of Sam Altman gives me the creeps even before I was aware of the allegations against him.

      He has the exact same vibe as Elizabeth Holmes. He does seem to be a bit better at it though.

brailsafe 3 days ago

Ironically for these products, the amount of obviously LLM-generated bot comments astroturfing on HN and making it to the top of these corporate shill posts really isn't attracting me to using any of them. Read a bunch of Glassdoor or Google Maps reviews and then read these comments: clear as the water in a glacial lake. If the comments aren't fake, maybe it's just the people working for them that are? If you produce enough fake slop to be indistinguishable from a machine slop generator, does the difference even matter anymore? Are you still human?

  • theragra 3 days ago

    If I understand your point, the counterargument is that HN comments are made by people who are almost certainly in the top 20% of the general population by intellect. Compared to some subreddits and YouTube, comments here are made by geniuses.

    • brailsafe 3 days ago

      Comments on HN are made by all sorts of people: some are geniuses that contribute authentically to threads that interest them, and some others operate or invest in glorified template generators that they use to write propaganda about their own products in order to fake popularity and consensus. If one believes everyone else is a genius that uses whichever product, then whoever's not on that train is already obsolete. The nature and style of top comments on LLM-related (last year's news) threads often read in such a predictable way, distinguishable from other arbitrary threads, that they seem a bit... robotic. Same thing on Reddit.

low_tech_punk 3 days ago

Funny timing. StackBlitz announced Bolt.new (https://bolt.new/) today with multi-file edit, an emulated filesystem, arbitrary npm installs, and it's open source. I feel ChatGPT is still chasing after Claude 3.5's Artifacts.

  • EgoIncarnate 3 days ago

    Bolt.new may be technically open source, but it seems to be dependent on closed-source StackBlitz WebContainers? Not truly open source IMHO.

    • bhl 3 days ago

      I was coincidentally looking into this yesterday, trying to find an implementation of JS sandbox to run AI-generated code or web apps in.

      A similar project is sandpack, but that relies on nodebox which is also closed source.

  • dotancohen 3 days ago

    Another comment rather disparages Bolt:

    https://news.ycombinator.com/item?id=41733485

    Has anyone had much experience with it, that can share their findings? I'm happy with Claude Sonnet and can't try every new AI code tool at the rate they are coming out. I'd love to hear informed opinions.

    • wildrhythms 2 days ago

      I tried Bolt this morning for about 5 minutes, and it did output valid React code, but things like the inputs and buttons it created didn't work without additional prompting, and then it started throwing "There was an error processing your request".

    • sunaookami 3 days ago

      Just tried it, it wrote promising code but in the end only the last file was created and the other files had no content (even though I saw them being written). Seems a bit bugged at the moment.

      EDIT: Only seems to work in Chrome?

  • varispeed 3 days ago

    The symbol and Blitz kind of give Nazi vibes. Did they think the naming through?

    • 1986 3 days ago

      "Blitz" literally just means "lightning"

gtirloni 3 days ago

More company efforts being spent on iterative solutions. This won't be enough to keep the hype up for the $7T.

I'm expecting they will exhaust the alphabet with GPT-4 variants before we see GPT-5, and even then, what major CS breakthrough will they need to deliver on the promise?

https://openai.com/index/planning-for-agi-and-beyond/

  • Version467 2 days ago

    I think o1 will keep them going for a while. You don’t need to be economical to keep investors interested as long as the tech progresses, and if their new test-time compute scaling holds true for another order of magnitude or two, they can just crank up the processing time to impress people whenever they need to raise more money.

    In addition, with the rollout of their realtime API we’re going to see a whole bunch of customer-service-focused products crop up, further demonstrating how this can generate value right now.

    So I really don’t think they’re running out of steam at all.

natch 3 days ago

It would be great if OpenAI could just wait on these announcements until they can turn on the feature for all Plus users at once.

Rant.

Their poor communication is exemplary for the industry. You can't even ask the old models about new models. The old models think that 4o is 4.0 (cute, team, you're so cool /s), and think that it's not possible to do multimodal. It's as if model tuning does not exist. I had a model speaking to me, telling me it cannot do speech. It was saying this out loud: "I cannot speak," it said, out loud. I get that the model is not the view/UX, but still. The models get other updates; they should be given at least the basic ability to know a bit of their context, including upcoming features.

And if not, it would be great if OpenAI could tell us some basics on the blog about how to get the new features. Unspoken, the message is "wait." But it would be better if this was stated explicitly. Instead we wonder: do I need to update the app? Is it going to be a separate app? Is it a web-only feature for now, and I need to look there? Do I need to log out and back in? Is it mobile only maybe? (obviously unlikely for Canvas). Did I miss it in the UI? Is there a setting I need to turn on?

This branching, combinatorially exploding set of possibilities is potentially in the minds of millions of their users, if they take the time to think about it, wasting their time. It brings to mind how Steve Jobs was said to have pointed out that if Apple can save a second per user, that adds up to lifetimes. But instead of just saying a simple "wait", OpenAI has us in this state of anxiety for sometimes weeks, wondering if we missed a step or what is going on. It's a poor reflection on their level of consideration, and lack of consideration does not bode well for them possibly being midwives for the birthing of an AGI.

smy20011 3 days ago

A pretty good UX improvement for ChatGPT. I think they will fork VS Code and make an AI editor later. Although it's confusing to me that OpenAI seems to spend more time on end-user products instead of working on AGI.

  • petesergeant 3 days ago

    I really hope VS Code can make API changes such that nobody needs to be forking it for these projects. I tried using Cursor for a few hours but I don’t want a hacked-up VS Code, I just want VS Code.

  • sanjit 3 days ago

    Great end user experience is a huge differentiator in my opinion… it’s why I use OpenAI over other products in the market. It’s actually a joy to use.

    (The responses are generally far better than other products.)

    • j_maffe 3 days ago

      It's a chatbot... what UI features has OAI provided that you couldn't find from other providers before? I've found Claude 3.5 to give better responses in my experience.

    • bamboozled 2 days ago

      Have you tried Claude? I don’t see how it could be much better than that.

  • yumraj 3 days ago

    AGI is a research project.

    These are revenue generators.

    Both have a place.

    • nmfisher 3 days ago

      If I genuinely thought I was close to AGI, I’d focus all my efforts on that, then ask it to build my revenue generators.

      That’s probably what Ilya is doing.

      (FWIW I don’t think we’re close to AGI).

      • vasco 3 days ago

        Saying you're close to AGI is a pretty good marketing move to sell consumer products though.

      • og_kalu 3 days ago

        That's how you go bankrupt before reaching your goal. "Close" doesn't necessarily mean next week close.

      • mijoharas 3 days ago

        What if you think the timeframe accelerates based on the number of ML engineers you can hire over, say, a 5+ year timeframe?

        If that's the case you need to optimize for hiring more ML engineers so you need revenue to bring in to pay them.

      • noch 3 days ago

        > If I genuinely thought I was close to AGI

        As a great founder once said: "Work towards your goal, but you must ship intermediate products."

        • satvikpendem 3 days ago

          Who said that?

          • noch a day ago

            > Who said that?

            George Hotz

      • meiraleal 3 days ago

        What if "close" is 5 years close? How would you survive 5 years with the current investment rate/costs?

  • torginus 3 days ago

    OpenAI probably employs traditional software engineers as well as a ton of researchers.

    The former are building tools like these, while the latter are conducting research and building new models.

    Since their skillsets don't overlap that much, I don't think the research would go faster if they skipped building products like these.

  • dpflan 3 days ago

    Why is that confusing?

HeralFacker 3 days ago

I wonder if the inherent public domain nature of genAI output will lead to licensing issues and forced code releases in the future.

Someone would have to prove slop code was used; but the same copyright issues with slop art apply, with the added complication of obscured licensing terms.

Where I work, genAI tools are banned because we do contracts that involve transfer of ownership with disclosure of third-party source. The lawyers state that, because genAI removes attribution, it runs afoul of the disclosure requirement. It's a fireable offense to even have the tools installed on work computers/devices.

kleiba 2 days ago

I think it's cool, but I find the user interface a tad too subtle. How about adding an animated paper clip or something that makes the editing suggestions?

cdchn 3 days ago

I don't care about this not having IDE integration; that's a solved problem. It would be nice to have two-way synchronization, though, so I could run code locally, edit it with my local editor (not IDE), and sync it back up to make more changes in ChatGPT.

lionkor 3 days ago

Here's an idea: If AI like this is so brilliant and can think for itself, why don't we just tell it to come up with its own next iteration? Surely if it can write code for medical devices, cars, planes, etc. (where no doubt junior engineers are extensively using it), then why not AI?

Can't we just tell ChatGPT to make e.g. TensorFlow faster, better, cleaner? Why do people put in so much work anymore, if the AI is so damn good?

  • jsheard 3 days ago

    Because it's not that damn good. Not even close.

  • ken47 3 days ago

    Don’t turn off the money spigot with your “questions.”

  • bedobi 3 days ago

    shhh don't ask legitimate questions! only hype now!

    • lionkor 3 days ago

      This looks amazing. Simply incredible what we are able to do. I'm ready for the next industrial revolution -- It's happening, now!

  • thornewolf 3 days ago

    I believe you are attacking a strawman here.

    • lionkor 3 days ago

      I believe it's not solving a real problem. I believe that the human skills of reasoning, logical thinking, etc. make it possible for any able-minded human to do the things I just mentioned, given that time and money are provided. AI can't do that. Let's call that process "programming". It can't do programming. It pretends to program.

      • thornewolf 3 days ago

        You are entitled to interpret the word "programming" in multiple different ways depending on the context of the conversation. What you propose here as "programming" is valid but not comprehensive of what everyone may think of when considering the "programming" process.

        By many definitions of "programming", these AI tools are indeed programming. In the same way, many definitions of "reasoning, ..." may include the AI tools in them. However, there are alternate definitions (reasonable in their own way) in which it is clear these AI tools fall short.

        So, I think you are proposing an argument about semantics but presenting it as if it were an argument about the actual capabilities of these AI tools. In general, we are all in agreement on what these systems are capable of.

        • ken47 3 days ago

          These AI tools are able to do a small subset of what a professional human is able to do, with greatly reduced flexibility, e.g. tasks like leetcode where there is an abundance of concentrated training data.

          I would bet that the current paradigm in ChatGPT will never be able to replace a minimally competent human at real world programming - the kind of programs people actually pay for. Let’s see.

          • pphysch 3 days ago

            > I would bet that the current paradigm in ChatGPT will never be able to replace a minimally competent human at real world programming - the kind of programs people actually pay for. Let’s see.

            Depends on definition of "minimally competent". There WILL be over-engineered enterprise solutions that employ 10x more AI-enabled code monkeys than is strictly required. Think about it: we can create a "fully automated AI coder" and then hire another FTE (or team) to handhold and clean up after it. It will be done. It must be done.

      • samrolken 3 days ago

        Why must LLMs or “AI” beat or match the smartest and most capable humans to be considered to solve a real problem? There’s been a lot of technology invented and in widespread use that solves real problems without having human-like intelligence.

      • sebastiennight 3 days ago

        As an able-minded human, could you please make TensorFlow faster, better, cleaner?

        I mean, there's strong incentive for you (it would be worth tens of millions, possibly billions if your iteration is superior enough).

        How much time do you need?

        • lionkor 2 days ago

          As I said, if I get paid for it and have the time, happy to do it.

aboringusername 3 days ago

Excuse my possible ignorance but I had a question: Does ChatGPT have 'up to date knowledge' in terms of latest versions of software?

When I have used it, I asked it to help me with VueJS, but whilst it knew about version 3 (which is substantially different from V2), it didn't know the very latest version.

Isn't this going to be problematic if it's trying to help code but has outdated information? Won't this result in more code being written/tied to older language/framework versions?

In my experience, because GPT outputs code tied to particular versions of software, I stick with those versions, as it's far quicker to copy/paste/edit than to read the latest docs, where GPT is absolutely useless since it simply doesn't know about the latest code and isn't able to produce working snippets.

So the real deal would be if GPT knows about the latest software instantly the moment it's published (ingesting docs data, any stackoverflow Q/A's etc).

Here is what GPT 4o-mini told me

P: can you list me the latest verisons of ALL major languages such as python, node etc. and the latest versions of typescript, angular, vuejs etc

R:

   > Python: 3.12.0 (released in October 2023)
   > Node.js: 20.5.0 (released in September 2023)
   > Java: 21 (released in September 2023)
   > C#: 12 (part of .NET 8, released in November 2023)
   > Go: 1.21.0 (released in August 2023)
   > Ruby: 3.2.2 (released in August 2023)
   > PHP: 8.3.0 (released in August 2023)
   > Rust: 1.72.0 (released in September 2023)
   > TypeScript: 5.3.0 (released in September 2023)
   > Angular: 16.2.0 (released in September 2023)
   > React: 18.2.0 (released in June 2023)
   > Vue.js: 3.3.0 (released in September 2023)

  • 8338550bff96 3 days ago

    You can count on ChatGPT to know the exact versions and release dates of software about as much as you can rely on your friendly neighborhood dev to know them off the top of their head - not at all. A dev would likely have to look it up, too. A language model would also need to look it up through function calling (or have it included in your prompt).

    These kinds of scenarios, where there are concrete answers in some datastore somewhere and the relevance of the content the model was trained on varies chronologically (a blog post on the latest version of React circa 2015 could deliver a wrong answer), are the ones you want to engineer around. This is where you start using SDKs and binding tools to your LLM so you can ensure grounding context is available to generate correct answers.
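
    A minimal sketch of that binding pattern with the OpenAI Python SDK (the `lookup_latest_version` helper is hypothetical; a real one would query PyPI, npm, or similar):

      import json
      from openai import OpenAI

      client = OpenAI()

      # Hypothetical helper: query a package registry instead of trusting
      # whatever the model memorized during training.
      def lookup_latest_version(package: str) -> str:
          registry = {"react": "18.2.0", "vue": "3.3.0"}  # stand-in for a real API call
          return registry.get(package.lower(), "unknown")

      tools = [{
          "type": "function",
          "function": {
              "name": "lookup_latest_version",
              "description": "Return the latest released version of a package.",
              "parameters": {
                  "type": "object",
                  "properties": {"package": {"type": "string"}},
                  "required": ["package"],
              },
          },
      }]

      messages = [{"role": "user", "content": "What's the latest version of React?"}]
      first = client.chat.completions.create(
          model="gpt-4o-mini", messages=messages, tools=tools)

      # If the model asks for grounding, run the tool and feed the result
      # back so the answer comes from the datastore, not stale training data.
      msg = first.choices[0].message
      if msg.tool_calls:
          messages.append(msg)
          for call in msg.tool_calls:
              args = json.loads(call.function.arguments)
              messages.append({"role": "tool", "tool_call_id": call.id,
                               "content": lookup_latest_version(**args)})
          final = client.chat.completions.create(
              model="gpt-4o-mini", messages=messages, tools=tools)
          print(final.choices[0].message.content)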

  • esafak 3 days ago

    This happens with other LLMs too. They don't know what they don't know. They ought to check what version you are using and at least issue a warning if the major number is different.
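
    A sketch of the kind of guard you're describing (my own illustration; no provider does this today as far as I know):

      def warn_if_major_mismatch(user_version: str, trained_version: str) -> str | None:
          """Warn when the user's major version differs from the newest one
          the model saw in training, since the advice may be outdated."""
          if user_version.split(".")[0] != trained_version.split(".")[0]:
              return (f"Warning: you are on {user_version}, but my training "
                      f"data only covers up to {trained_version}.")
          return None

      # e.g. a user on Vue 2 asking a model trained on Vue 3 docs
      print(warn_if_major_mismatch("2.7.16", "3.3.0"))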

ailun 3 days ago

I'm glad this exists. But their example with a blog post draft is something I will never do. ChatGPT writing makes me cringe at this point—I might as well write the entire thing because of the number of edits. Natural-sounding writing with a distinctive voice will become more and more valuable in the future.

Maro 3 days ago

I wouldn't read too much into this, as OpenAI is a relatively big company with 3k+ employees (and I'm sure they'll reach 5-10k headcount soon), ie. they have enough firepower to run product experiments like this.

But, what I find interesting here is that this product could be developed by anybody using OpenAI (or other) API calls, ie. OpenAI is now experimenting with more vertical applications, versus just focusing on building the biggest and best models as fast as possible to keep outpacing the competition.

If this is more than just an experiment, which we don't know, that would be a very interesting development from the biggest AI/LLM player.

  • fiatpandas 2 days ago

    I think you are going to see a lot more of this. You can’t take billions in funding and only tweak your horizontals.

famahar 3 days ago

Hard to get excited when Claude is significantly better at coding than ChatGPT. The Canvas feature is great and I hope Claude does something similar. But LLMs improve so fast that who knows whether Claude will still be worth using a year from now.

  • bamboozled 3 days ago

    I've had fairly limited experience with both tools, but Claude seemed much, much nicer and gave me better results in every case I've used it.

throwaway314155 3 days ago

Finally catching up with Anthropic.

  • wenc 3 days ago

    Arguably this is better than Claude projects because you can prompt and edit inline. You cannot with projects. Claude keeps regenerating the artifact.

    This is closer to Cursor for writing than Claude Projects.

    • tomrod 3 days ago

      Cursor's moat always seems a shaky proposition. Clone VS code, add a few custom blobs and extensions, API to existing LLMs.

      For that, $20/month per head to be usable? Yikes.

vunderba 3 days ago

Slick interface but the example they gave is depressing.

We taught the model to open a canvas for prompts like “Write a blog post about the history of coffee beans”.

If you're not heavily editing this post to say something genuinely new, then congratulations you've added even more drivel to the Great Pacific Garbage Patch of the internet.

  • renegade-otter 3 days ago

    Some people really think that they are now "content writers" with ChatGPT, just as those spitting out sort-of-working scripts think they are "software engineers".

    I am sorry, dears, but this is not how it works. To be good at any of this, you should be able to do it yourself without any "prompt engineering", and the only path is through work, time, trial/error, and tons of frustration.

    • jhbadger 3 days ago

      I get the argument against AI tools when it's about them not working as well as the hype says they do, but not when they are the "how dare you use a tool that makes it easier for you! That's cheating!" argument. When I was in school cheap pocket calculators were just becoming available. All of my teachers banned them as they saw them as an existential threat to teaching math and science. These days students are generally allowed calculators -- the teachers finally accepted that it's a good thing that tools can automate the rote parts so that teaching can move on to the more interesting and thought-provoking parts.

      • antifa 20 hours ago

        It's not about "cheating" as much as it's about spamming.

    • TiredOfLife 3 days ago

      Content writers like you already destroyed the Internet by filling it with SEO word vomit everywhere.

  • koppeirh 3 days ago

    Making products to empower sloppyjoes is all they have left it seems.

  • riffraff 3 days ago

    yeah I find this example depressing, as much as the "rewrite this simple sentence into a paragraph that adds nothing to it".

    But to be less negative, this (or NotebookLM) could be useful to re-arrange and enrich one's own notes.

    Sadly the amount of LLM slop on the internet is already out of control, and I'm afraid there's no going back.

whalesalad 3 days ago

I want to take an existing Python application (which is 100% a 'sufficiently complicated concurrent program in another language contains an ad hoc informally-specified bug-ridden slow implementation of half of Erlang.'[0]) and port it to Elixir.

Retaining the core business logic, but re-homing it inside of idiomatic elixir with a supervision tree. At the end of the day it is just orchestrating comms between PSQL, RMQ and a few other services. Nothing is unique to Python (its a job runner/orchestrator).

Is this tool going to be useful for that? Are there other tools that exist that are capable of this?

I am trying to rewrite the current system in a pseudocode language of high-level concepts in an effort to make it easier for an LLM to help me with this process (versus getting caught up on the micro implementation details) but that is a tough process in and of itself.

[0] - https://www.goodreads.com/author/quotes/423160.Robert_Virdin...

  • vessenes 3 days ago

    How big is this codebase? o1-preview and Claude Sonnet are both fairly capable. Claude has longer context windows and is a little less lazy. aider.chat is working on an architect / coder mode right now that's verrrry effective. Essentially: paste your whole codebase into o1-preview, ask it to rearchitect to spec, and output a design document and list of tasks suitable for a senior dev. Dump that plus your tarball into Claude as two separate pastes, and be like "Do task 1".

    • machiaweliczny 2 days ago

      Would love a more specific example of this. How to use this architect mode.

      BTW surprised that tarballs work - aren’t these compressed?

      • vessenes 2 days ago

        Just incant tar cvf and leave off the z or j: there’s a tiny binary header but Claude won’t mind.

        Check out the aider GitHub repository for details but 4o is responsive to text requests too.

        • whalesalad 2 days ago

          ya for extra clarification - a tarball or .tar file is an uncompressed container for files. It can also be compressed (.tar.gz, .tar.bz2, etc)

          ultimately though the end user is not really concerned with this. the tarball needs to be un-tar'd regardless of whether it is compressed. (some nuance here as certain compression formats might not be supported by the host... but gzip and bzip2 are common)

          I haven't tested a compressed tarball yet but I would imagine chatgpt won't have issues with that.

          • vessenes 2 days ago

            Actually I literally just paste the raw tar file in to Claude. Gpt does not like this as much.

danielovichdk 3 days ago

No thank you.

As with anything else that is helpful, there is a balancing act to be aware of. This is too much for my taste. Just like github copilot is too much.

It's too dumb like this. But chatgpt is insanely helpful in a context where I really need to learn something I am deep diving into or where I need an extra layer of direction.

I do not use the tool for coding up front. I use them for iterations on narrow subjects.

  • atomic128 3 days ago

    Don't fight this. Try to profit from it. People love these tools and they will become utterly, utterly dependent.

    Using a spell-checker, I have gradually lost my ability to spell. Using these LLM tools, large parts of the population will lose the ability to think. Try to own them like farm animals.

    The large number of tokens processed by iterative models requires enormous energy. Look at the power draw of a Hopper or Blackwell GPU. The Cerebras wafer burns 23 kW.

    One avenue to profit is to invest in nuclear power by owning uranium. This is risky and I do not recommend it to others. See discussion here: https://news.ycombinator.com/item?id=41661768

    • AndyKelley 3 days ago

      Counterpoint: don't try to own people like farm animals. Have pride in humanity, faith in your fellow man, and resist toxic cynicism.

      • atomic128 3 days ago

        Fortunes are made owning companies that sell harmful products. Domino's Pizza, Monster Energy Drink, etc.

        Fortunes will be made selling electricity to people who develop serious cognitive dependence on LLMs.

        There is no need for you to participate in the profits. I respect your life choices and I wish you well.

        • AndyKelley 3 days ago

          I don't respect your life choices and I wish you failure.

          • satvikpendem 3 days ago

            I had to laugh at their comments too. Well, at least they're being cynically honest.

      • CamperBob2 3 days ago

        Don't try to own uranium, either. Or if you do, don't store it all in one place.

        • atomic128 3 days ago

          The Sprott Physical Uranium Trust owns 65,711,826 pounds of triuranium octoxide (U3O8) stored at uranium hexafluoride (UF6) conversion facilities in Canada, France, and the United States.

            Cameco Corporation, ConverDyn, and Orano Chimie-Enrichissement
            individually act as custodians on behalf of the Trust for the
            physical uranium owned by the Trust.
          
          https://sprott.com/investment-strategies/physical-commodity-...

          Please see the discussion here:

          https://news.ycombinator.com/item?id=41661768

          for serious warnings. This is not suitable for you.

          • CamperBob2 3 days ago

            Interesting! Didn't realize that submarine reactors used bomb-grade uranium.

    • vasco 3 days ago

      > large parts of the population will lose the ability to think. Try to own them like farm animals.

      You're so edgy that you might cut yourself, be careful. What is wrong with making profit by helping people through providing a service?

      • beej71 3 days ago

        Ask Purdue Pharma... :)

    • jakkos 3 days ago

      > Try to own them like farm animals.

      Jesus christ, I hope you are never in a position of any significant power

      • cvs268 a day ago

        What would Jesus do? :-)

        (did you just take the lord's name in vain?)

  • gdhkgdhkvff 3 days ago

    I haven’t used it yet, but couldn’t you just copy/paste a chunk of existing code into canvas and have it help there? If so, that does seem more useful than the original approach of just pasting a lot of code into chat and hoping it pulls the correct context from your description. If I’m understanding it correctly, in canvas you can paste a bunch of code and then ask for help on specific contexts within the code by highlighting it. If done properly, that seems super useful to me.

  • meowface 3 days ago

    I think you should try to give tools like this another chance. If Andrej Karpathy can say AI-assisted programming is a productivity boost for him (https://x.com/karpathy/status/1827143768459637073), it can be a productivity boost for probably any programmer.

    • Workaccount2 3 days ago

      There are three groups of people here:

      1.) Those who use AI and talk about it.

      2.) Those who do not use AI and talk about it.

      3.) Those who use AI and talk about how they do not and will not use AI.

      You don't have to look far to see how humans react to performance enhancers that aren't exactly sanctioned as OK (Steroids).

    • UncleMeat 3 days ago

      Andrej is a great communicator. I've never seen evidence that he is an especially exceptional programmer.

      • meowface 2 days ago

        This is a ridiculous statement. It's possible he isn't an absolutely amazing programmer*, but given he's probably in the top 0.01% in terms of implementing novel (and often groundbreaking) ideas via programming and that he finds AI-assisted programming tools useful, it suggests AI-assisted programming is now more than just a CRUD code monkey assister.

        *(it's also possible he is)

      • og_kalu 3 days ago

        Between his roles at the companies he's worked for, his research, and his open-source stuff, what else would you be looking for, lol?

        Remove all the educator stuff and Karpathy would still be one of the most accomplished of his generation in his field.

        Idk just seems like a weird comment.

        • UncleMeat 2 days ago

          Research contributions are almost entirely separate from engineering quality. This is true for the vast vast vast majority of work people do in grad school.

          • og_kalu 2 days ago

            Cool. Everything else ?

jug 3 days ago

Ugh. Code reviews and helping out with tedious code comments. That's great stuff for software developers, and it will be a headache for our company to control. It takes increasingly more restraint for developers not to send code as-is straight to OpenAI, especially when the features are fundamentally built on the assumption that you do so.

OpenAI doesn't train on business data on their enterprise plans, but the problem is when a company doesn't have such a plan, perhaps because it went with a competitor or simply has nothing. Users then go to OpenAI with their Plus subscription or whatever to become more efficient. That's the problem.

Asking an AI for help is one thing. Then you can rewrite it to a "homework question" style while at it, abstracting away corporate details or data. But code reviews? Damn. Hell, I'm certain they're siphoning closed source as I'm writing this. That's just how humans work.

  • tomjen3 3 days ago

    I guess your company will have to update their rules about sharing code with AI then. Not to worry: your code is not that special; your data is.

    • moomoo11 3 days ago

      I feel bad for the ppl who will or probably already are dealing with this crap. TFW someone shares a data dump with AI

CGamesPlay 3 days ago

I'd love a version of this that worked with files directly on my local system. Aider is close, but it's too integrated with git for simple one-off scripts, which is a use case where I find GPT-generated code shines.

Is there an open source version of this and/or Claude Artifacts, yet?

  • anotherpaulg 2 days ago

    You can do this to quickly work on a small script:

      aider --no-git oneoffscript.py

skywhopper 3 days ago

I beg tech companies to please stop naming things “canvas”.

  • electronbeam 3 days ago

    ACM/IEEE should really run a name allocation service

s1mon 3 days ago

It seems like this only supports "JavaScript, TypeScript, Python, Java, C++, or PHP". I would be so happy if it worked with FeatureScript (which is similar to JavaScript, but is for 3D CAD in Onshape).

I wonder what it would take to expand the languages it supports?

_pdp_ 3 days ago

Obviously this is not targeted specifically at developers but at the general population who need to solve some general problems through code and to learn.

I guess I don't need to point this out given where I am posting this comment, but developers (myself included) are some of the most opinionated, and dare I say needy, users, so it is natural that any AI coding assistant is expected to be built into their own specific development environment. For some this is a local LLM; for others, anything that integrates directly with their preferred IDE.

fsndz 3 days ago

OpenAI is basically eating all the GPT wrappers over time, as well as integrating all the interesting work done at the prompt level (CoT, structured outputs) into the model level. I wonder if that trend is bound to continue as profitability issues become central to OpenAI and they need to take an ever bigger share of the AI application cake to survive (coding/writing tools etc.): https://www.lycee.ai/blog/why-sam-altman-is-wrong

ijidak 3 days ago

Question. Are there any great AI plug-ins anyone recommends for Sublime Text?

For me it's still the cleanest editor.

VS Code is way too cluttered to be my daily driver for basic editing.

modeless 3 days ago

I thought I didn't have this yet but turns out it's there, just buried in the drop down model selection menu. I tried it out on an existing chat thread where I was writing JavaScript and when I said "use canvas" it started rewriting my JavaScript to use HTML5 <canvas>. I had to tell it "use canvas, not html5 canvas but chatgpt canvas" and then it figured it out.

It seems like this will be perfect for making tiny single page HTML/JS apps.

nomilk 3 days ago

How do we actually use this?

I tried selecting 'ChatGPT 4o with canvas' from the model drop down, uploading a code file, and asking "can we look at this file, I want to edit it with you", but it doesn't show canvas features or buttons that the instructional video has i.e. the UI still looks identical to ChatGPT.

EDIT: I asked "where are the canvas features" and boom - the UI completely changed to match what the instructional video shows.

bux93 3 days ago

The part about writing blogs for you and adding emoji is the real deal here - finally I can express myself on LinkedIn!

infecto 2 days ago

Amazing implementation. I wish JetBrains had implemented something like this within their IDEs instead of doing whatever they did. I tried their AI solution out again and for whatever reason it's just slow and not good. The line completion is nowhere near the same level as GH Copilot's.

vercantez 3 days ago

Wish they explained how they perform targeted edits. Is it doing speculative decoding like Anthropic’s code edit or something like Aider’s diff format?

  • rglover 3 days ago

    Probably even simpler: give the full context, then add the text selection as an additional message step/context indicating what to edit.
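
    If that's the mechanism, a rough sketch of it against the public chat completions API might look something like this (a guess at the approach, not anything OpenAI has documented; the model name and prompts here are invented):

      // Sketch: full document as context, selection + instruction as an extra message.
      import OpenAI from "openai";

      const client = new OpenAI(); // reads OPENAI_API_KEY from the environment

      async function editSelection(doc: string, selection: string, instruction: string) {
        const response = await client.chat.completions.create({
          model: "gpt-4o",
          messages: [
            { role: "system", content: "You are an editor. Return only the rewritten selection." },
            // Full document first, so the model has the whole context...
            { role: "user", content: "Full document:\n\n" + doc },
            // ...then the selected span and the edit request as an additional message step.
            { role: "user", content: "Rewrite only this selection:\n\n" + selection + "\n\nInstruction: " + instruction },
          ],
        });
        return response.choices[0].message.content;
      }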

sramsay 3 days ago

Everyone in this thread is talking about coding, but this is, perhaps more significantly, a word processor. “Change reading level: Adjusts the reading level, from Kindergarten to Graduate School.” “Adjust the length: Edits the document length to be shorter or longer.” “Suggest edits: ChatGPT offers inline suggestions and feedback.”

textlapse 3 days ago

I expected the last line of the blog post to say something like "this blog post was created with the help of Canvas" - a missed opportunity, or is the product not there yet?

Also, is this in response to the recent NotebookLM, which seems awfully good for an experiment?

sunny_sigara 21 hours ago

Is this free on Mac with the native app ?

matt3210 3 days ago

I hate it when my coworker sends me 10 page emails that are AI generated. For non-serious IP I put it back into an AI to compress it into bullet points. Maybe just send me bullet points instead?

mmaunder 3 days ago

Anyone actually got access to this?

  • SeanAnderson 3 days ago

    Yeah I do. Playing with it right now. It's cool.

    https://i.imgur.com/R5PQQoi.png

    • CubsFan1060 3 days ago

      How/where were you notified that you got it?

      • zamadatix 3 days ago

        As another data point: I wasn't notified either. I didn't have it when I first read about Canvas here earlier, but coming back to the topic now I do have the feature. This is for a Teams account; the post says Plus and Teams are the first to get the rollout, with Enterprise and others to follow in a week.

        The easiest way to check whether you have access is that it will appear as an explicit choice in the "Model" selector.

      • SeanAnderson 3 days ago

        I wasn't. I saw this post on HN, opened a new tab for ChatGPT, and saw that I had access to the model. I assume it's rolling out incrementally over a few hours to all paid users.

        • bagels 3 days ago

          Thanks, I found that I had access too, on the web. Just open up the ChatGPT page and use the model dropdown at the top of the page.

          You don't get the new experience until you give it a prompt though, which is kinda weird.

bearjaws 3 days ago

Good lord, we do not need more fancy editors to produce AI slop. You can already do this with a myriad of solutions, including just the plain ChatGPT interface with o1.

No matter what there will be so many GPT-isms, and people will not read your content.

cschneid 3 days ago

It seems sort of weird to keep pushing the chat interface so hard into programming. For 'real' usage, it seems like Cursor or Aider approaches work better, since you end up having AI write code, you manually edit, AI updates further, and back and forth. In a chat interface, copy/pasting updated code gets old fast.

On the other hand, I did have good luck w/ Anthropic's version of this to make a single page react app with super basic requirements. I couldn't imagine using it for anything more though.

  • Benjaminsen 3 days ago

    Pretty sure this will dynamically rewrite the code. No copy pasting needed. We have something very similar at FRVR.ai

SeanAnderson 3 days ago

I'm playing around with this right now and it's pretty sweet. It shows in real time which lines it's "thinking" about and feels very dynamic, like I'm collaborating with a machine live.

It can't display markdown and formatted code side-by-side which is kind of a surprise.

I haven't tried doing anything super complex with it yet. Just having it generate some poems, but it's smart enough to be able to use natural language to edit the middle of a paragraph of text without rewriting the whole thing, didn't notice any issues with me saying "undo" and having data change in surprising ways, etc. So far so good!

I'm not very skilled at creating good "test" scenarios for this, but I found this to be fun/interesting: https://i.imgur.com/TMhNEcf.png

I had it write some Python code to output a random poem. I then had it write some code to find/replace a word in the poem (sky -> goodbye). I then manually edited each of the input poems to include the word "sky".

I then told it to execute the python code (which causes it to run "Analyzing...") and to show the output on the screen. In doing so, I see output which includes the word replacement of sky->goodbye.

My naive interpretation of this is that I could use this as a makeshift Python IDE at this point?

  • WD-42 3 days ago

    I’m kinda giggling imagining the amount of electricity you used to write “sky..”.replace(“sky”, “goodbye”)

  • coalteddy 3 days ago

    How do I get access to this feature? I cannot find it in the normal chatgpt interface.

    • SeanAnderson 3 days ago

      It's a staged rollout. You'll probably have it by tomorrow morning.

    • aaronharnly 3 days ago

      I believe you wait until your number comes up :/

    • whimsicalism 3 days ago

      it's under the model list on the web interface

  • unshavedyak 3 days ago

    Is it possible to have the local context be a directory and all files within it or something? Ie to just ask it questions on the side of your WIP repo? Use your normal editor/etc.

    • SeanAnderson 3 days ago

      Doesn't look like this is yet possible, no

throwgfgfd25 3 days ago

Do they not understand that the example text they are using in the first image is so laughably banal that it makes an entire segment of its potential audience not want to engage at all?

Shoot me in the face if my own writing is ever that bad.

ETA: just to be clear... I am not a great writer. Or a bad one. But this is a particular kind of bad. The kind we should all try to avoid.

  • janalsncm 3 days ago

    > Do they not understand

    I see this all the time from AI boosters. Flashy presentation, and it seems like it worked! But if you actually stare at the result for a moment, it’s mediocre at best.

    Part of the issue is that people who are experts at creating ML models aren’t experts at all the downstream tasks those models are asked to do. So if you ask it to “write a poem about pizza”, as long as the result generally fits the description, it goes into the demo.

    We saw this with Gemini’s hallucination bug in one of their demos, telling you to remove film from a camera (this would ruin the photos on the film). They obviously didn’t know anything about the subject beforehand.

    • throwgfgfd25 3 days ago

      > Part of the issue is that people who are experts at creating ML models aren’t experts at all the downstream tasks those models are asked to do.

      Yep. CAD, music, poetry, comedy. Same pattern in each.

      But it's more than not being experts: it's a subliminal belief that there either isn't much to be expert in, or a denial of the value of that expertise. If what they do can be replicated by a neural network trained on the description, is it even expertise?

      Unavoidably, all of this stuff is about allowing people to do, with software, tasks they would otherwise need experts for.

      • janalsncm 3 days ago

        Well, comedians still exist, despite the fact that ChatGPT can write an endless stream of “jokes” for next to zero cost. So do musicians. I know less about poetry and CAD but I assume people who seek out those modalities aren’t going to be impressed with generic garbage. A person who seeks out poetry isn’t going to be easily impressed.

        • throwgfgfd25 3 days ago

          No. But then all of these products are marketed to people who are, at some domain-specific level, still towards the "but I wore the juice!" end of the scale, right?

          Unskilled and unaware of it. Or rather, unskilled and unaware of what a skilled output actually involves. So, unaware of the damage they do to their reputations by passing off the output of a GPT as their own.

          This is what I mean about the writing, ultimately. If you don't know why ChatGPT writing is sort of essentially banal and detracts from honesty and authenticity, you're the sort of person who shouldn't be using it.

          (And if you do know why, you don't need to use it)

  • yawnxyz 3 days ago

    This means you're a great writer — congrats! I'm a terrible writer, and this kind of crutch is really useful.

    Other people in our lab (from China, Korea, etc.) also find this kind of thing useful for working / communicating quickly

    • throwgfgfd25 3 days ago

      Well, I've just read back through some of your comments and I say that ain't so!

      Write honestly. Write the way you write. Use your own flow, make your own grammatical wobbles, whatever they are. Express yourself authentically.

      Don't let an AI do this to you.

      • meiraleal 3 days ago

          Person A: Me try make this code work but it always crash! maybe the server hate or i miss thing. any help?
        
          Person A with AI: I've been trying to get this code to work, but it keeps crashing. I'm not sure if I missed something or if there's an issue with the server. Any tips would be appreciated!
        
        For a non-native English speaker, it's much better professionally to use AI before sending a message than to appear authentic (which you won't, in a language you aren't fluent in; better to sound robotic than to write like a 10-year-old).
        • gloflo 3 days ago

          Person A with AI: In the bustling world of software development, where lines of code intertwine to create the intricate tapestry of our digital lives, I find myself facing a challenge that has proven to be both perplexing and frustrating. I’ve spent over a decade honing my skills as a developer. Known for my analytical mind and commitment to excellence, I’ve navigated various programming languages, frameworks, and projects that I’m proud to have contributed to.

          Recently, I stumbled upon a bug that initially seemed minor but quickly revealed itself to be a formidable adversary. It disrupted the seamless user experience I had meticulously crafted, and despite my best efforts, this issue has remained elusive. Each attempt to isolate and resolve it has only led me deeper into a labyrinth of complexity, leaving me frustrated yet undeterred.

          Understanding that even the most seasoned developers can hit a wall, I’m reaching out for help. I’ve documented the symptoms, error messages, and my various attempts at resolution, and I’m eager to collaborate with anyone who might have insights or fresh perspectives. It’s in the spirit of community and shared knowledge that I hope to unravel this mystery and turn this challenge into an opportunity for growth.

        • Spivak 3 days ago

          It's pretty good for native English speakers at work who need/want a reverse anger translator.

          Me: This is the most garbage code I've ever seen. It's bad and you should feel bad. It's not even wrong. I can't even fathom the conceptual misunderstandings that led to this. I'm going to have to rewrite the entire thing at this rate, honestly you should just try again from scratch.

          With AI: I've had some time to review the code you submitted and I appreciate the effort and work that went into it. I think we might have to refine some parts so that it aligns more closely with our coding standards. There are certain areas that are in need of restructuring to make sure the logic is more consistent and the flow wouldn't lead to potential issues down the road.

          I sympathize with the sibling comment about AI responses being overly verbose, but it's not that hard to get your model of choice to keep a somewhat consistent voice. And I don't even see it as a crutch; it's just an automated secretary / personal assistant for people not important enough to warrant a human one. I think a lot of us on HN have experienced the stark contrast between comms from the CEO and the CEO as paraphrased by their assistant.

      • yawnxyz 3 days ago

        Aw thanks! I at least have the benefit of being a fluent writer.

        For lots of East Asian researchers it's really embarrassing to send an email riddled with typos, so they spend a LOT of time making their emails nice.

        I like that tools like this can lift their burden

        • throwgfgfd25 3 days ago

          > For lots of East Asian researchers it's really embarrassing for them to send an email riddled with typos, so they spend a LOT of time making their emails nice.

          OK -- I can see this. But I think Grammarly would be better than this.

          • asukumar 3 days ago

            Grammarly uses generative AI

            • throwgfgfd25 3 days ago

              It does now, perhaps, for complete rewrites. I've not looked recently.

              But its suggestion system, where it spots wordy patterns and suggests clearer alternatives, was available long before LLMs were the new hotness, and is considerably more nuanced (and educational).

              Grammarly would take apart the nonsense in that screenshot and suggest something much less "dark and stormy night".

      • j_maffe 3 days ago

        Thanks for saying this. Whenever Grammarly puts a red line under a slightly superfluous part of a sentence, I get more and more agitated at this small nudging toward robotic writing.

        • yawnxyz 3 days ago

          Grammarly thinks all writing should be bland, and that everyone needs to be a robot. Terrible product.

          • throwgfgfd25 3 days ago

            But it does favour _clarity_, rather than tropes.

            • j_maffe 3 days ago

              There's more to writing than clarity, though. Not all written communication needs to abide by the efficient, clear writing style of technical documentation, FFS.

              • throwgfgfd25 3 days ago

                Sure, if you're writing a novel, maybe.

                But there's not much more important, stylistically, to writing a business email or document than clarity. It's absolutely the most important thing, especially in customer communications.

                In the UK there is (or used to be) a yearly awards scheme for businesses that reject complexity in communications for clarity:

                https://www.plainenglish.co.uk/services/crystal-mark.html

                But anyway, you don't have to act on all the suggestions, do you? It's completely different from the idea of getting an AI to write generic, college-application-letter-from-a-CS-geek prose from your notes.

                • j_maffe 2 days ago

                  With enough repetitive suggestions asking for the same thing, it will just continuously push your style towards this ultra-dry writing. Plus, even in business emails it's important to show a human side. It's not like Grammarly's push for clear writing actually helps in any way; most times it just outright suggests removing relevant info from the sentence. They push the service as a way to incentivise subscriptions, writing quality be damned.

          • j_maffe 3 days ago

            More red lines means more subscribers, right?

    • whimsicalism 3 days ago

      you're not at all a terrible writer... although you do overuse ellipses in your comments.

      • yawnxyz 3 days ago

        I never even thought about that... I don't know why I do that :P

  • BugsJustFindMe 3 days ago

    > Do they not understand

    They don't care. Their goal is to accelerate the production of garbage.

    • csomar 3 days ago

      I am trying to convince myself that I am not insane and everyone else is. The platform was literally down for me for a good 12 hours or so because they had an auth problem or bug. Their interface is subpar, yet they are trying to convince people that this is replacing knowledge workers any minute now. I recommended that a friend use ChatGPT to write some English content and it did a bad job. I checked bolt yesterday and the code it produced for a very simple app was complete garbage hallucination.

      I really liked copilot/AI when the focus was on hyper-autocomplete. I wish the integration were LSP + autocomplete + compilation check + docs correlation. That would boost my productivity 10x and save me some brain cycles. Instead we are getting garbage UX/backends that are trying to fully replace devs. Give me a break.

      • mvdtnz 3 days ago

        I'm with you. I feel like I'm losing my mind. Everyone around me is talking about the looming AGI, death of the knowledge worker and how "everything" has changed. But every time I try to use these text generators I get nothing useful from them. It's like the whole world has bought into a mass hallucination.

        • lovethevoid 3 days ago

          It makes more sense when you realize that, sure, there might be slight variation in output, but generally speaking the people tripping over themselves about how [current version] is so amazing aren't being totally honest about why they think it's amazing.

          For them, the ability to generate so much trash is the good part. They might not even be fully aware that it's trash, but their general goal is to output more trash because trash is profitable.

          It's like all those "productivity systems": not a single one will magically produce a noticeable increase in productivity that you can't get from a $1 notebook; they just make you feel like you are being more productive. Same with RP bots or AI text editors. They make you feel so much faster, and for a lot of people that's enough, so they want in on a slice of the AI money pit!

          • adunsulag 3 days ago

            It's a tool, like any other tool a software developer would use. In areas where I have a lot of repetition or need to pore through verbose (but simple) documentation, it's such a game changer. I can spend 5 minutes thinking about what I want the machine to do, give it some samples of what I expect the output to be and voilà, it generates it, often 100% correct if I've got the prompt put in properly, sometimes good enough with a bit of refinement. This is something I would normally have delegated to a junior team member or subcontractor, but now I'm saving time and money.

            Occasionally I sink 1-2 hours into tweaking something I thought was 90% correct but was in reality garbage. That happened a lot more with earlier models, but it's becoming increasingly rare. Perhaps I'm recognizing the limitations of the tool, or the systems are indeed getting better.

            This is all anecdotal, but I'm shipping and building faster than I was previously and it's definitely not all trash.

        • gen220 3 days ago

          Most people are incapable of assessing quality and defer that to others. Or their spectrum for quality is so narrow GPT's output spans it.

          If you accept that we live in a world where blind lead the blind, it's less surprising.

      • dumbfounder 3 days ago

        Garbage in, garbage out. It is not going to imagine your perfect scenario and then create it for you. I take anyone saying it is garbage with a grain of salt, because it is incredibly useful for me. And others think so too, so how can your bad experience negate that? It can't. If you can craft the right prompts, it can make you much more efficient. Anyone saying it is going to replace whole people en masse is just part of the hype machine. But if all it does is make every human on earth 1% more efficient, that is an obscene amount of value.

  • mvdtnz 3 days ago

    That exact banality has somehow made them into a 150 billion dollar business and darling of hacker news.

    • csomar 3 days ago

      They plateaued on model performance and they are running on hype. They need to keep the momentum going by "releasing" stuff, so they are pushing out garbage at the moment. Given that open-weight models are so close to GPT-4, their value is exactly 0 unless they can produce a new model with a significant jump in coherence.

      Them releasing this stuff actually suggests they don't have much progress on their next model. It's a sell signal, but today's investors made their money under ZIRP, so they have no idea about the real-world market. In a sense this is the market funneling money from the stupid to the grifters.

  • wilde 3 days ago

    TBF it looks like it’s intended as a “before” image, but yes, I suspect the “after” isn’t much better.

    • throwgfgfd25 3 days ago

      Is it? I thought that was the draft, as a result of the dialogue in the sidebar. If I am wrong then OK!

  • WD-42 3 days ago

    I thought the same thing: the “blog post” in the example image is an example of the absolute trash that’s being spewed onto the internet by these tools. 10+ sentences and yet somehow nothing actually said.

    • aflukasz 3 days ago

      Well, the UI has a slider for length, so there is that.

      • throwgfgfd25 3 days ago

        "Please say this, but in more sentences"

  • RigelKentaurus 3 days ago

    The text is supposed to be banal, so that ChatGPT can make it better. It's like the "before" picture in an exercise program.

    • throwgfgfd25 3 days ago

      It's not, is it? It's meant to be the draft it created from the notes.

skybrian 3 days ago

> You can directly edit text or code.

But not run it.

Any online code playground or notebook lets you both edit and run code. With OpenAI it's either one or the other. Maybe they'll get it right someday.

moonmagick 3 days ago

Yawn. I don't use Claude because the interface is good. I use it because Opus 3 is the best model anyone has ever created for long-context coding, writing, and retrieval. Give me a model that doesn't have a polluted dataset used to game MMLU scores, something that tangibly gives good results, and maybe I'll care again.

For now I only keep ChatGPT because it's a better Google.

  • zamadatix 3 days ago

    I've found Sonnet 3.5 significantly better than Opus 3 at coding but I've not done much long context coding with it. In your experience did you find Opus 3 to degrade less or is it that you consider Sonnet 3.5 part of the "gamed" group?

  • heyjamesknight 3 days ago

    Have you used Gemini? With the built-in RAG I actually find it way better than both Google Search and OpenAI for search. I think Claude still wins for overall chat quality but Gemini is amazing for Search, especially when you're not exactly sure what you're looking for.

    Disclaimer: I work at Google Cloud, but I've had hands-on dev experience with all the major models.

    • moonmagick 3 days ago

      Initially it had some real problems: a large context window, but you could only paste 4k tokens into the UI, for example. It never seemed like anyone at Google was using it. NotebookLM is a great interface, though, with some nice bells and whistles, and finally shows what Gemini is capable of. However, Opus still has the best long-context retrieval with the least hallucination of what I've tried.

      3.5 Sonnet is fast, and that is very meaningful for iteration speed, but I find that for the level of complexity I throw at it, it strings together really bad solutions compared to the more holistic solutions I can work through with Opus. I use Sonnet for general knowledge and small questions because it seems to do very well with shorter problems and is more up-to-date on libraries.

    • staticman2 3 days ago

      I don't know that I've ever seen someone recommend Gemini Advanced for "search". My experience is that the model doesn't always tell you if it's using search or its internal training; in fact I'm not sure it even is "searching" the internet rather than accessing some internal Google database.

      In comparing its performance to the pure model on Google AI Studio, I realized Gemini was presenting some sort of RAG results as the "answer" without disclosing where it got that information.

      Perplexity, which is hardly perfect, will at least tell you it is searching the web and cite a source web page.

      I'm basically saying Gemini fails at even the simplest thing you would want from a search tool: disclosing where the results came from.

andrewstuart 3 days ago

I’ve found Claude and ChatGPT to be the best coding tools for me.

I often throw the results from one into the other and ping pong them to get a different opinion.

  • CamperBob2 3 days ago

    Ditto with Gemini and o1-preview. Ask one of them to write some code, then paste it into the other and ask it if it can find any bugs. Lather, rinse, repeat. It's proven helpful a couple of times so far.

throwaway4233 3 days ago

OpenAI seems to be slowly moving towards building an entire OS (or some other form of walled garden) that will be filled with LLM-powered applications.

  • poisonborz 3 days ago

    The abuse the term "OS" gets is incredible

  • bhl 3 days ago

    You get tighter feedback loops and better products if you own the vertical.

    OpenAI already A/B tests the responses it generates. Imagine if they owned the text editor or spreadsheet you work in too; it could incorporate all of your edits to be self-correcting.

tsunamifury 3 days ago

I appreciate the elegant product design and ability to refine to a greater degree but I have to ask myself …

We’ve compressed the world’s knowledge into a coherent system that can be queried for anything and reason on a basic level.

What do we need content for anymore? Honestly, why generate this? It seems like a faux productivity cycle that does nothing but poorly visualize the singularity.

Why not work on truly revolutionary ways to visualize this singularity and make radically new things with it? Embody it. Map its infinite coherence. Give it control in limited zones.

Truly find its new opportunities.

ada1981 3 days ago

Pretty dope. I like the feature in claude. I also like the visual update. It does seem to glitch out if you get past 2,000 words though.

u2077 3 days ago

> We used novel synthetic data generation techniques, such as distilling outputs from OpenAI o1-preview, to post-train the model for its core behaviors. This approach allowed us to rapidly address writing quality and new user interactions, all without relying on human-generated data.

So they took a bunch of human-generated data and put it into o1, then used the output of o1 to train canvas? How can they claim that this is a completely synthetic dataset? Humans were still involved in providing data.

profsummergig 3 days ago

If anyone can clarify what "Comment’s Triggering Correctness" likely means, please do. Thank you.

joshdavham 3 days ago

I'm not sure how much I'll enjoy Canvas for coding (we'll see), but it looks far better suited to writing!

nikitaga 3 days ago

Do you reckon this would become available to third party clients via the API, or would it be exclusively a chatgpt.com feature?

Benjaminsen 3 days ago

Very close to the subsystems we built for FRVR.ai. Their UX is way cooler than ours, though; we should take some inspiration.

nprateem 3 days ago

Jesus Christ. Even their example has "in... the fast-paced world of..."

Chatgpt is utter, utter shit at writing anything other than this drivel.

mergisi 3 days ago

Canvas lets you interact with and edit code/documents more fluidly. I used it to transform my HTML blog into TypeScript in no time! Super helpful for coding and experimenting. https://x.com/mustafaergisi/status/1841946224682774536

  • kristianp 3 days ago

    How do you go from html to typescript?

    • zamadatix 3 days ago

      I think they mean that at the end of the clip they drag the slider from HTML to TypeScript, and it puts all of the blog data into structured TypeScript data, then writes a function which will generate an HTML page from that. The resulting blog output will still eventually be HTML; it's just a question of whether the entries are kept as data segments automatically stitched together or entered via raw HTML formatting.
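
      Something like this shape, if I'm reading the clip right (a sketch; the field names are invented):

        // Blog entries as structured data rather than raw HTML...
        interface Post {
          title: string;
          date: string;
          body: string;
        }

        const posts: Post[] = [
          { title: "First post", date: "2024-10-03", body: "Hello, world." },
        ];

        // ...plus a function that stitches the data back into an HTML page.
        function renderBlog(entries: Post[]): string {
          return entries
            .map((p) => `<article><h2>${p.title}</h2><time>${p.date}</time><p>${p.body}</p></article>`)
            .join("\n");
        }

        document.body.innerHTML = renderBlog(posts);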

    • mergisi 3 days ago

      I used OpenAI Canvas to help with the conversion. Essentially, I took the static HTML/CSS and wrote TypeScript to dynamically render blog posts.

Simon_ORourke 3 days ago

For a second there I mis-read that as Canva, and thought what the hell?

gdiamos 2 days ago

Nice to see OpenAI adopting useful Claude features

yawnxyz 3 days ago

The most surprising part of this announcement was the team who worked on this — more people worked on and contributed to this than many startups have in total. There are 16 people working on this project!!

If each was paid $300k (that's a minimum...) and they spent a year on this, it'd make it a $5M project...

  • ibbih 3 days ago

    1) very unlikely they spent a year on just this; 2) they are certainly paid more in total comp

    • yawnxyz 3 days ago

      Right, so it probably evens out to $4-5M for a feature like this? Still surprising to me how expensive features like this are to build.

pikseladam 3 days ago

openai gpt is a service, not a product, but canvas is the first product that openai has built. i guess we will see new products in the future. canvas is a new product; that's why they didn't introduce it at devday.

  • dgellow 3 days ago

    ChatGPT is their first product. Canvas would be the second?

    • pikseladam 3 days ago

      i think chatgpt is not a product but a service, like dropbox. i'm not saying it is useless or can't be sold; it is just a feature for a product. GPT, in itself, is a powerful technology or tool that enhances user experiences in various applications. It provides natural language processing capabilities like answering questions, generating text, assisting with tasks, and so on. However, without a specific context or integration into a larger platform, GPT is more akin to a feature that can improve or enable specific functionality within products.

      • dgellow a day ago

        I don’t understand what distinction you’re trying to make. Dropbox is a product too. People buy subscriptions for ChatGPT, that’s the product they want. It could be based on whatever model, that’s just the platform used to build the product. Users want the brand and the UX of ChatGPT, the whole bundle. That’s the product OpenAI sells

qiller 3 days ago

Since when "Add emojis for clutter and noise" became an USP...

charlie0 3 days ago

Did they just kill Cursor?

  • nsonha 3 days ago

    Have you actually tried it? I have not, but just look at how it still regenerates the entire code instead of producing a diff. Who iterates on software like that?

    • charlie0 3 days ago

      I haven't tried it, but have been considering it. I already pay for OpenAI, not sure I want to pay for another AI service.

      • nsonha 3 days ago

        I am referring to Canvas.

        Cursor, on the other hand, works by producing minimal diffs and allows you to iterate on multiple files at once, in your IDE. There are tools of the same type that compete with Cursor, but Canvas is too bare-bones to be one of them.

        The trial is free.

paxys 3 days ago

Seems to mostly be a UI enhancement? Or is there deeper functionality?

bilekas 3 days ago

I was really hoping this was a weird interface with html canvas.

lionkor 3 days ago

Can't wait for more bullshit PRs to our projects! Thanks to AI, anyone can open a PR that gets instantly rejected.

  • vultour 3 days ago

    Looks like you're missing an AI to auto-close the PRs for you.

    Seriously though, I'm tired of the "helpful" GitHub bots closing issues after X days of inactivity. Can't wait for one powered by AI to decide it's not interested in your issue.

    • lionkor 2 days ago

      Those bots are great - the maintainers don't bother fixing the issue, so it goes stale, gets auto-closed, and boom! Metrics go up. Another issue successfully closed! Another bug fixed! Until someone else opens a new issue for it, which can then be closed as a duplicate. I love it!

stainablesteel 3 days ago

glad to see that, it's the next natural step after seeing selective edits become possible with images

shaunxcode 3 days ago

in a word : yuck : show your work : cite your sources : or say nothing at all

caocanx 3 days ago

Can someone save Xcode?

ArmedSandwich 2 days ago

Canvas is cool and all... but I still feel like Claude has a bit of an edge in this realm at the moment. Now I may be wrong, but the test cases I ran on both this morning just had wildly different outcomes, with Claude's still being ahead of ChatGPT's.

wseqyrku 3 days ago

this would be incredible for scripting.

FlamingMoe 3 days ago

Why am I always last to get the cool new features??

amelius 3 days ago

Can this be used to refactor a codebase?

Or is it meant to be used on just a single file?

  • yawnxyz 3 days ago

    Looks like this beta is single file, like a chat instance. They just added Github integration for enterprise, so that's probably on the horizon

  • zamadatix 3 days ago

    The single file aspect of many of these answers is what grinds me as well. I mean, it's fantastic for a short script, a function/class template, or a crack at the syntax error... but it becomes a huuuuge pain even when it's something as simple as a .h for the file you're working on.

  • lionkor 3 days ago

    What's refactoring? Usually when we say refactoring we want to get from one state into another - like making code testable.

    Do you want AI to do this for you? Do you trust that it will do a good job?

    • yawnxyz 3 days ago

      I've done a ton of refactoring, from Python to Node / Deno, and it's surprisingly good — but not perfect.

      Having it create a testing suite definitely helps. But it makes fewer mistakes than I would normally make... it's not perfect but it IS way better than me.

anonzzzies 3 days ago

Why don't companies learn from Apple at its best: announce, then immediately available? Sure, I know why, but that used to be why I liked them. This marketing grift is terrible.

  • arcatech 3 days ago

    Apple Intelligence won’t be available for months.

    • anonzzzies 3 days ago

      I am not talking about Apple now: 10 years ago they announced something and I could order it the same night. That is magic. The rest is just: yeah, who cares.

    • durandal1 3 days ago

      Pre-announcing makes no sense when you're leading the pack; it's a stall tactic to keep your customers from leaving while you try to catch up.

FactKnower69 3 days ago

...woof. seems like we're already to the point where every openAI product launch is about half as interesting as the last one

timeon 2 days ago

Now they need to make another app to read all of that generated text.

badgersnake 3 days ago

Yawn, it’s still ChatGPT so it’ll still be useless.

dheera 3 days ago

yawn Let me know when I can download it.

joshstrange 3 days ago

The issue I continue to have with many AI coding tools is that they want me to use their own editor ("native", aka a VSCode fork, or in the browser like this). I have zero intention of moving away from IDEA, and nothing I've seen so far is good enough to make me want to switch. I really wish there were more "bringing AI into your existing tools/workflows" instead of "here is a new tool with AI baked in".

  • jdiez17 3 days ago

    I have started using Claude Dev (an extension for VSCode - https://github.com/saoudrizwan/claude-dev), and so far my impression has been very positive. It’s a full blown code agent that looks for relevant files in the code base, can ask you to run commands and modify files etc. You use your own Anthropic API key or self hosted model.

    • ijidak 3 days ago

      Are there any great AI plug-ins anyone recommends for Sublime Text?

      For me it's still the cleanest editor.

      VS Code is way too cluttered to be my daily driver for basic editing.

      • brunoqc 3 days ago

        Is the zed-editor not too cluttered for you?

        If not, it has some ai support.

    • j45 3 days ago

      Sounds a little like aider.chat

      • jdiez17 3 days ago

        It is quite similar, but I found aider a bit clunky to use in that it creates a new commit whose huge message is the whole conversation and context. Which can be a good thing of course, but for most things I'd rather accumulate changes until a feature is finished, then commit.

        • trees101 3 days ago

          I use aider with the --no-auto-commits flag. Then review code and do manual edits in VSCode, as well as manual git process. It achieves exactly what you're saying.

          aider --sonnet --no-auto-commits --cache-prompts --no-stream --cache-keepalive-pings 5 --no-suggest-shell-commands

          • j45 2 days ago

            This is a great tip, appreciate it

        • j45 3 days ago

          I remember that... but I think it's changed now.

          Aider remains to me one of the places where innovation happens and it seems to end up in other places. Their new feature to architect with o1 and then code with sonnet is pretty trippy.

          Can only run so many IDEs at a time though.

        • dannyobrien 3 days ago

          I think the default is not to do this anymore (at least the whole convo and chat aren't in the commit). It is strangely scary to have it commit on every change, even if that's probably objectively the right thing for it to do (so you can roll back, so that commits are atomic, etc, etc).

        • rorytbyrne 3 days ago

          Just work on a branch and squash.

    • knowaveragejoe 3 days ago

      Claude Dev does have a really nice UI nailed down. I was not aware that you could run local models yourself with it, which would be a great feature but kind of obviates the name.

  • Bnjoroge 3 days ago

    The problem is that tacking on to an existing product, while the ideal approach, limits just how creative you can get. I believe this is one of the reasons Cursor had to fork VSCode. Simply being an extension limited the features they could build.

    • joshstrange 3 days ago

      I completely get this and I don’t have a good answer to the issue. I almost wish I liked VSCode, since almost all AI editors are forks of it (not sure what the extension compatibility story is), but I prefer IDEA.

      • NitpickLawyer 3 days ago

        > (not sure what the extension compatibility story is)

        So far I've had all the vscode extensions just work in cursor (including devcontainers, docker, etc.) I hope it continues like this, as breaking extensions is something that would take away from the usefulness of cursor.

      • yas_hmaheshwari 3 days ago

        I am also using intelliJ, but have started tinkering with Cursor

        My hunch says that IDEA should be very worried. If I am on the edge evaluating other tools because of AI-assisted programming, a lot of others must be doing that too.

    • ford 3 days ago

      The other problem is IDEs are incentivized to build their own AI coding tools instead of letting other people plug in

      • 8n4vidtmkvmk 3 days ago

        IDEA allows you to customize the UI quite a bit, has plugin hooks specifically for AI tools, and has several 3rd party plugins already.

        None of these points seems to apply.

        They're still selling their yearly subscription even if they can't upsell me on an AI subscription

    • zbobet2012 3 days ago

      Emacs has none of these problems :p

      • mark_l_watson 2 days ago

        True. I have been very happy using ellama.

    • curiouscavalier 3 days ago

      Solid point. We’re building in a space adjacent to this and it definitely feels hard to balance removing friction for adoption and delivering everything we want.

      • zmgsabst 3 days ago

        Okay, but why does what you want matter?

        I find a lot of teams are so focused on their vision that they fail to integrate their tool into my workflow. So I don’t use them at all.

        That’s fine for art, but I don’t need opinionated tools.

        • curiouscavalier 2 days ago

          Who said what we want has anything to do with our vision? Please don’t make assumptions about what drives our product development.

          What I want in a product comes from customer interviews. It’s not “my opinion” other than perhaps our team’s interpretation of customer requests. A customer can want certain pain points addressed and have friction to move to a particular solution at the same time.

          Or is wanting a product that meets customer needs too opinionated?

  • TiredOfLife 3 days ago

    JetBrains has its own AI.

    There is also https://codeium.com/jetbrains_tutorial. I have been using its free tier for half a year and quite like it.

    Supermaven has https://plugins.jetbrains.com/plugin/23893-supermaven, also with a good free tier. (Although they recently got investment to make their own editor.)

  • grbsh 3 days ago

    I feel the exact same! I built this tool to make it much easier for me to bring LLMs into existing workflows: https://github.com/gr-b/repogather

    It helps find relevant content to copy to your clipboard (or just copies all files in the repo, with exclusions like gitignore attended to) so you can paste everything into Claude. With the large context sizes, I’ve found that I get way better answers / code edits by dumping as much context as possible (and just starting a new chat with each question).

    It’s funny: Anthropic is surely losing money on me from this, and I use gpt-mini via API to compute the relevancy ratings, so OpenAI is making money off me despite having (in my opinion) an inferior coding LLM / UI.

    • starfezzy 3 days ago

      I do something similar with a script. Don’t know if anything about my approach could improve yours, but I’m also curious if you have any suggestions?

      - Mine prepends the result with the output of running `tree -I node_modules --noreport` before any other content. This informs the LLM of the structure of the project, which leads to other insights: it will know which frameworks and paradigms your project uses without you needing to explain that stuff.

      - Mine prepends the contents of each included file with “Contents of relative/path/to/file/from/root/of/project/filename.ts:” to reinforce the context and the file’s position in the tree.
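
      A minimal Node sketch of a script with that shape (assumes `tree` is installed; taking the file list from argv is my simplification, not necessarily how the real script works):

        import { execSync } from "node:child_process";
        import { readFileSync } from "node:fs";

        // 1. Project structure first, so the LLM can infer frameworks and layout.
        let output = execSync("tree -I node_modules --noreport").toString() + "\n";

        // 2. Each file's contents, prefixed with its path to reinforce its position in the tree.
        for (const file of process.argv.slice(2)) {
          output += "Contents of " + file + ":\n" + readFileSync(file, "utf8") + "\n\n";
        }

        process.stdout.write(output);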

  • ipsod 3 days ago

    Tabnine has an IDEA plugin. It's not quite as good as Cursor, in my opinion, but it's better to have Tabnine and IDEA than Cursor and VSCode.

    It started out as just predictive text, but now it has a chatbot window that you can access GPT, Claude, etc. from, as well as their own model which has better assurances about code privacy.

  • machiaweliczny 2 days ago

    Just use aider - it’s a CLI and open source. You can open a terminal in your IDE and just use /add file for context, then /drop to clear the context. Better than Copilot, and only slightly worse UX than Cursor (which has the benefit of knowing which file you have open). There’s also Claude Engineer, but I haven’t tested it yet.

    Forgot to mention the killer feature: dropping links to docs into aider automatically fetches them, which helps with grounding for specific tasks.

    • rsp1984 2 days ago

      Look, I'm just not interested in something that gives me code on the CLI. This is no better or worse than using ChatGPT / Canvas or any other external tool.

      My project has well over half a million lines of code. I'm using an IDE (in my case Qt Creator) for a reason. I'd love to get help from an LLM but CLI or external browser windows just aren't the way. The overhead of copy/paste and lack of context is a deal breaker unfortunately.

      In case I'm missing something, please let me know. I'm always happy to learn.

      • nuancebydefault 2 days ago

        As I understand things, you...

        - want to use your particular ide which does not have the llm plugin.

        - don't want to use any of several ide's that support several llm's using a picker.

        - don't want to use copy/paste to a web browser or other tool

        - don't want to use 2 ide's at the same time if 1 of them is not your favorite

        I would settle for the 3rd or 4th option; both work very well for me.

      • JackC 2 days ago

        What I'm trying right now is two IDEs -- PyCharm for navigating around and static analysis stuff, Cursor for "how do I do this" or "please do this for me." Cursor (VSCode?) lets you choose Jetbrains keyboard shortcuts during setup and is relatively resource light so it's not too bad.

      • autoconfig 2 days ago

        Aider operates on your file tree / repo and edits and creates files in place. So it at least lessens the copy / paste drastically. This is a very different experience than using chatgpt or Claude on web. Still not ideal UX compared to having it in the IDE though to be clear.

      • Novosell 2 days ago

        I'd never heard of Qt creator before so I checked it out and their whole front page seems to focus on ai integration?

        • rsp1984 2 days ago

          Cool, this must be fairly recent. I didn't realize they had integrated it. Will check it out, thanks!

  • mooktakim 3 days ago

    This is not an issue. Technology moves forward: if you don't adapt, you fall behind. There were other editors and IDEs before the one you use. New devs will use the new ones.

    Anyway, I don't use them either. I prefer to use ChatGPT and Claude directly.

    • atlantic 3 days ago

      Technology also moves into dead ends. Not every change is progress. You can only tell a posteriori which paths were fruitful and which were not.

      • mooktakim 2 days ago

        Everything ends. Even things you used for a long time.

        • vbezhenar 2 days ago

          Almost every program I used 20 years ago is still available today. I think I switched from Eclipse to IDEA like 15 years ago, but Eclipse is still rocking. IT really froze in the 1990s. OSes didn't change at all; they just switch fancy colors and border radii every few years. Software is the same; they just add more annoying bugs and useless features, but nothing really changes. I'm still using the same unix shell and unix tools I used 20 years ago; I'm still grepping and sedding files around.

          • mooktakim 2 days ago

            Stone tablets and chisel technically still available also.

    • joshstrange 3 days ago

      Overall I agree with everything you’ve said and I also use ChatGPT and Claude directly. The issue is that:

      Good at integrating AI into a text editor != Good at building an IDE.

      I worry about the ability for some of these VSCode forks to actually maintain a fork and again, I greatly prefer the power of IDEA. I’ll switch if it becomes necessary, but right now the lack of deep AI integration is not compelling enough to switch since I still have ways of using AI directly (and I have Copilot).

      • mooktakim 3 days ago

        I'm guessing AI will fundamentally change how IDEs even work. Maybe everything IDEs offer right now won't be needed when you have a copilot you tell what to do.

        I'm a long-term vim user. I find all the IDE stuff distracting and noisy; AI makes it even noisier. I'm guessing the new generation will just be better at using it, similar to how we got good at "googling stuff".

    • saagarjha 3 days ago

      My coworkers do just fine with vim.

      • mooktakim 3 days ago

        "past performance is not indicative of future results"

        • stefankorun 3 days ago

          Is it not though? It's not a guarantee but definitely an indication.

          • mooktakim 3 days ago

            Not really. Only thing you can guarantee is things change.

            • NeoTar 2 days ago

              Let’s just throw away all past experience then?

              It’s a mistake to assume that there will be 100% correlation between the past and future, but it’s probably as bad of a mistake to assume 0% correlation. (Obviously dependent on exactly what you are looking at).

              • chiefalchemist 2 days ago

                0% maybe not. But it's the outliers and the didn't see that comings that kill ya. Sometimes literally.

                So while the odds at the extremes are low, they cannot be ignored.

                No one can predict the future. But those that assume tomorrow will be like today are - per history - going to be fatally wrong eventually.

              • mooktakim 2 days ago

                So the choices are 100% or 0%?

                • NeoTar 2 days ago

                  That’s my point – they are not. Your previous comment implied to me a belief that any attempt to draw inference from past events was doomed to failure!

                  Each circumstance is different. Sometimes the past is a good guide to the future – even for the notoriously unpredictable British weather, apparently you can get a seventy percent success rate (by some measure) by predicting that tomorrow's weather will be the same as today's. Sometimes it is not - the history of an ideal roulette wheel should offer no insights into future numbers.

                  The key is of course to act in accordance with the probability, risk and reward.

                  • mooktakim 2 days ago

                    I did not speak with certainty. Everything I said is guess and opinion.

        • pera 2 days ago

          vim is the "just put your money in an index fund" of text editors

    • shafyy 2 days ago

      This is exactly what OpenAI and other want you to believe. "OH NO, I need to use LLMs for coding otherwise I will fall behind". No, no. Most of what makes a good software engineer cannot be replaced by LLMs. A good software engineer has a deep understanding of the problem space, works on the right things, and elevates their team members by coaching, helping etc. It's not about how fast you type your code.

      • mooktakim 2 days ago

        There's still time to find out if what you say is true

    • dartos 2 days ago

      I refuse to believe there were ever editors before vim.

      Vim has been around since the Stone Age.

      Jokes aside, I don’t really see why AI tools need new editors vs. plugins, EXCEPT that they don’t want to have to compete with Microsoft’s first-party AI offerings in vscode.

      It’s just a strategy for lock-in.

      An exception may be like zed, which provides a lot of features besides AI integration which require a new editor.

      • mooktakim 2 days ago

        They probably said the same thing when someone created vim, or vi.

        • dartos 2 days ago

          Sorry, I’m not understanding what you mean.

          Vi and vim were never products sold for a profit.

          Who was saying what? And what were they saying?

          EDIT: ah I think I understand now.

          The thing is, I don’t see any advantage to having AI built into the editor vs having a plug-in. Aider.vim is pretty great, for example.

          The only reason to have a dedicated editor is a retention/lock in tactic.

          • mooktakim 2 days ago

            Every time there's a new editor, or anything else, people complain why we need another one. Sometimes that new thing is innovative.

            • dartos 16 hours ago

              Sure, I just don’t see what an AI first editor would have over vscode, vim, or whatever already exists + an extension.

              The editor handles the human-to-text-file interface: key inputs, rendering, managing LSPs, providing hooks to plugins, etc. AI coding assistants kind of sit next to that; they just handle generating text.

              It’s why many of these editors just fork vscode. All the hard work is already done; they just add lock-in, as far as I can tell.

              Again, zed is an exception in this pack bc of its CRDT and cooperative features. Those are not things you can easily add on to an existing editor.

    • hackable_sand 2 days ago

      Falling behind what?

      • mooktakim 2 days ago

        If I knew answer to that question I wouldn't be falling behind

        • hackable_sand 2 days ago

          That's cool

          So you're just out here wasting my time

          See you

  • mtam 3 days ago

    Have you considered this one: https://github.com/continuedev/continue

    • icelancer 3 days ago

      I used this while Cursor was broken (Pylance problems), but Continue's code replace tooling sometimes will delete huge swaths of adjacent code. I've filed a comprehensive ticket in their repo and they're working on it, but I've been able to reproduce the problem recently.

      I think it has to do with Cursor's much better custom small models for code search/replace, but can't be sure.

    • constantlm 3 days ago

      I didn't realise Continue had a Jetbrains IDE plugin. Neat! Going to give it a go.

    • joshstrange 3 days ago

      I tried it a while back and had a lot of trouble getting it to work, it's on my list to try again. I also tried Sourcegraph's Cody and just constant errors even after paying for the Pro plan so now I'm back on Copilot.

      • kc_sourcegraph 3 days ago

        Hey, product manager from sourcegraph here. Sorry to hear you got errors. What were the errors, and where were you using Cody (VS Code, Web, or JetBrains)?

        • joshstrange 3 days ago

          JetBrains IDEA. Here is at least one of the errors I got:

          https://github.com/sourcegraph/jetbrains/issues/1306

          • kc_sourcegraph 3 days ago

            Gotcha. Our JetBrains plugin has improved a lot since then; we've been focusing a ton on making performance better. If you could, I'd love to get you trying Cody again! I can even throw in 1 month free for you; I think I see your customer profile in Stripe :)

      • phist_mcgee 3 days ago

        Tried to set it up with IntelliJ. Absolutely infuriating experience trying to get it to connect to a model. No error messages or info.

        Really wouldn't recommend this in its current state.

    • usaphp 3 days ago

      I just tried it, has some bugs and feels unpolished, but it works for Jetbrains, which is a game changer for me too

  • Terretta 3 days ago

    > AI coding tools want me to use their own editor

    Instead of putting the AI in your IDEA, put it in your git repo:

    https://aider.chat/

    • anonzzzies 3 days ago

      And Copilot Workspace for people using GitHub.

  • faangguyindia 3 days ago

    The best ways I've found are Aider and Zed; Zed's AI integration is crazy good (it lets you delete the AI response, unlike Cursor AI).

    Zed is lightning fast.

    Wish it had more features.

  • sureglymop 3 days ago

    So, I think that with the agent frameworks that exist now, this wouldn't be very hard to realize. What's needed, though, is a full API for the editor that can be tapped into.

    Ironically, for creating that, these new-age code editor startups would probably have more luck with neovim and its extensive Lua API than with VS Code. (Of course, the idea with using a VS Code fork is about capturing the market share it has.)
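
    As a rough sketch of what tapping into such an editor API could look like, here's the idea against neovim's RPC interface via the pynvim package (summarize_with_llm is a hypothetical stand-in for whatever agent framework you'd plug in):

        import pynvim

        # attach to a neovim instance started with:  nvim --listen /tmp/nvim.sock
        nvim = pynvim.attach('socket', path='/tmp/nvim.sock')

        # read the current buffer, hand it to an agent, write the result back
        lines = nvim.current.buffer[:]                    # list of strings, one per line
        rewritten = summarize_with_llm('\n'.join(lines))  # hypothetical LLM call
        nvim.current.buffer[:] = rewritten.splitlines()
        nvim.command('write')                             # save the buffer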

  • KronisLV 2 days ago

    > I have zero intention of moving away from IDEA and nothing I've seen so far is good enough to make me want to switch. I really wish there was more of "bringing AI into your existing tools/workflows" instead of "here is a new tool with AI baked in".

    For this, either GitHub Copilot or their own AI plugin seem to work nicely.

    It's kind of unfortunate because creating new plugins for the JetBrains IDEs has a learning curve: https://plugins.jetbrains.com/docs/intellij/developing-plugi...

    Because of this, and the fact that every additional IDE/tool you have to support also means similar development work, most companies out there will probably lean in the direction of either a web based UI, a CLI, or their own spin of VS Code or something similar.

  • campers 3 days ago

    I mainly use CLI tools for AI assistance.

    I'll use Continue when a chat is all I need, to generate some code/script to copy-paste in. When I need to prepare a bigger input I'll use the CLI tool in Sophia (sophia.dev) to generate the response.

    I use Aider sometimes, less so lately, although it has caught up with some of the features in Sophia (which builds on top of Aider), like being able to compile and lint, and separating the design from the implementation LLM call. With Aider you have to manually add/drop files from the context, which is good for having precise control over which files are included.

    I use the code agent in Sophia to build itself a fair bit. It has its own file selection agent, and also a review agent which helps a lot with fixing issues on the initial generated changes.

  • lemming 3 days ago

    I am actually building something along these lines, IntelliJ-native rather than a second-class VS Code follow-along. Is this something you (or your company) would pay for? Say $50/year for the plugin and you bring your own API keys?

  • d4rkp4ttern 2 days ago

    Zed lets you set it up trivially with JetBrains keymaps. It doesn’t have a debugger and is rough around the edges, but I like that most of my JetBrains shortcuts continue to work with Zed.

  • bagels 3 days ago

    Even if you use vscode, they want you to use a vscode fork... Why not just make a vscode plugin?

    • icelancer 3 days ago

      Continue.dev's plugin is as close as it gets to Cursor, but there are clearly limitations to a VSCode plugin when it comes to inline editing and code search/replace - made a comment above about it.

  • yunohn 2 days ago

    The primary issue is that IDEA and other editors don’t allow the level of flexibility and customizability needed to make new experiences, unlike VSCode. Whether new solutions need to fork it or not is a different question. For example, Cursor absolutely has made significant changes for user benefit that stock VSCode with Copilot doesn’t have.

    • lemming 2 days ago

      I’m a professional IntelliJ plugin developer; this is absolutely not true. IntelliJ is incredibly customisable, and you can always code raw Swing to achieve whatever you want if there’s something that the base platform doesn’t provide.

      • yunohn 2 days ago

        Appreciate the added perspective, I wasn’t aware. But from my usage of IDEA, I’ve yet to find an extension that makes any major changes to the UI/UX; I see them way more often in the VSCode community. In fact, I find the IDEA plugin system incredibly limited overall. Why do you think that’s the case? Or maybe I’ve been in a bubble and you can point to some interesting ones that I’ve maybe missed?

        • lemming 2 days ago

          More than it not being possible, it's not regarded as desirable - IntelliJ users greatly value consistency. The fact that the experience is consistent across different languages is one of the things I like best about IntelliJ. Can you give me an idea of the sort of thing that you'd like to see that you haven't seen?

          Here's an example of something I recently added (an inline inspector), that my main competitor (VS Code) said wasn't possible with the VS Code APIs: https://cursive-ide.com/blog/cursive-1.14.0-eap1.html. I have another major feature that I don't have good online doc for, which is also not possible with the VS Code API (parinfer, a Clojure editing mode). This gives you an idea of what it looks like, but this is old and my implementation doesn't work much like this any more: https://shaunlebron.github.io/parinfer.

  • jdgoesmarching 3 days ago

    Likely because ~70% of OpenAI’s revenue comes from ChatGPT Plus/Teams/Enterprise. Model access is just not as profitable, so slapping on features to encourage upgrades is their best path forward.

    It’s not great:

    https://www.wheresyoured.at/oai-business/

  • amelius 2 days ago

    The main issue I have is that I have to share my code. My employer won't allow it (they don't even allow us to use gmail).

    • underlines 2 days ago

      You either do local inference or get Azure OpenAI to have your own private gpt-4o or whatever. :)

  • Timber-6539 2 days ago

    Open source fixes this problem. This is basically Llama's moat.

  • benreesman 3 days ago

    Setting aside the icky demo of how much worse /r/LinkedinLunatics is going to get: is this the front foot of a globe-spanning, cutting-edge research lab at the very efficient frontier of human and machine intelligence?

    Or is it yet another lame distraction effort around the abject and embarrassing failure to ship GPT-5?

    These people are pretty shameless in ways that range from “exceedingly poor taste” to “interstate wire fraud” depending on your affiliation, but people who ship era-defining models after all the stars bounced, they are not.

  • smrtinsert 2 days ago

    A million times this. I have 0 interest in this product.

  • wseqyrku 3 days ago

    If only you could commit the changes in the browser and pull locally.

    • mark_l_watson 2 days ago

      Replit with AI support is web based and has GitHub integration.

  • nonethewiser 3 days ago

    I mean… that's just what an IDE is. Integrated. Existing IDEs can integrate models via plugins, or they can build new IDEs with first-party support.

epolanski 3 days ago

Slightly OT, but it's quite interesting that the most decisive impact generative AI is having right now is on tech workers, and software developers in particular.

I'm more and more convinced that we're on the edge of a major shake-up in the industry with all these tools.

Not getting replaced, but at this rate of improvement I can't unsee major changes coming.

A recent junior on my team built his first app entirely with ChatGPT one year ago, when he still didn't know how to code. He could figure out how to fix the imperfect code by reasoning, all of it as a non-coder, and actually released something that worked for other people.

  • jen729w 3 days ago

    I'm a solo business owner who knows enough JS/TS/HTML/CSS to get by -- my product is a website full of information -- but I've never been 'a developer'.

    ChatGPT et al. are a miraculous boost to my productivity. This morning I needed a function to iterate over some JSON and do stuff with it. Fairly mundane, and I could have written it myself.

    Doing so would have been boring, routine, and would have taken me at least an hour. I asked ChatGPT 4o and I got exactly what I wanted in 30 seconds.
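
    It was roughly this kind of thing (a sketch from memory; the file and field names are made up):

        import json

        # load the data and pull out the entries I care about
        with open('records.json') as f:
            records = json.load(f)

        # e.g. group titles of sufficiently-scored records by category
        by_category = {}
        for item in records:
            if item.get('score', 0) >= 50:
                by_category.setdefault(item['category'], []).append(item['title'])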

    I can only hope that these tools enable more people like me to build more cool things. That's how it's affected me: I never would have hired another dev. No job is lost. I'm just exponentially better at mine.

    • genericacct 3 days ago

      It needs to be said that web dev is by far the area in which LLMs are best versed, I suppose due to the incredible amount of training data available. Other languages produce more hallucinations for now.

      • authorfly 3 days ago

        A good way to work with this is to use different agents

        Aider is pretty biased towards Python, for example (its sample prompts largely use and test on Python)

    • justinbaker84 3 days ago

      I have had a similar experience. I build WordPress websites, and ChatGPT has allowed me to extend their functionality a great deal without needing to learn PHP.

    • Timber-6539 2 days ago

      This is the killer app for gen AI. Everything else said about AI, e.g. replacing workers, is just building castles in the air.

  • NomDePlum 3 days ago

    I'm sure that's true but I've also seen non-developers build systems using Access or solve complex critical business problems using excel and similar. I've seen a lot of junior developers tinker with apps and do great work from reading docs and tutorials. I was one of those myself a long time ago.

    For some problems this is a perfect solution. For a lot it's a short-term fix that turns into a long-term issue. I've been on many a project that's had to undo these types of setups, for very valid reasons and usually at a very high cost. Often you find them in clusters, with virtually no one having a full understanding of what they actually do anymore.

    Building the initial app is only a very small part of software engineering. Maintaining and supporting a service/business and helping them evolve is far harder, but essential.

    My experience is that complexity builds very quickly to a point where it's unsustainable if not managed well. I fear AI could well accelerate that process in a lot of situations if engineering knowledge and tradeoffs are assumed to be included in what it provides.

    • sanp 3 days ago

      I think in 4-5 years’ time there will be huge demand for fixing / maintaining these LLM-built apps. Perhaps the LLMs that follow can do it.

  • boodleboodle 3 days ago

    The more I think about it, the more I am convinced developers will be the "first to go" when AGI takes over. Before bloggers and YouTubers. Because programming is an activity that requires the least amount of "grounding to reality" among all human activities. We made sure of this with layers and layers of convenient abstraction.

    What about the developers who code the AI systems? Well... I am sure AGI will come from "bootstrapping AIs", just like we see with compilers that compile themselves. When I see Altman and Sutskever talking about AGI being within reach, I feel they are talking about this bootstrapping AI being within reach.

    • emmanueloga_ 3 days ago

      Mechanical Turks will be the first to go.

      More seriously, the output quality of LLMs for code is pretty inconsistent. I think there's an analogy to be made with literature. For instance, a short story generated by an LLM can't really hold a candle to the work of a human author.

      LLM-generated code can be a good starting point for avoiding tedious aspects of software development, like boilerplate or repetitive tasks. When it works, it saves a lot of time. For example, if I need to generate a bunch of similar functions, an LLM can sometimes act like an ad-hoc code generator, helping to skip the manual labor. I’ve also gotten some helpful suggestions on code style, though mostly for small snippets. It’s especially useful for refreshing things you already know—like quickly recalling "How do I do this with TypeScript?" without needing to search for documentation.
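
      To make the "ad-hoc code generator" point concrete, the kind of thing I mean is asking for a family of near-identical functions and letting the model stamp them out (a toy sketch; the conversion factors are the standard ones):

          # the sort of repetitive family an LLM is happy to generate on request
          def meters_to_feet(m: float) -> float:
              return m * 3.28084

          def kilograms_to_pounds(kg: float) -> float:
              return kg * 2.20462

          def celsius_to_fahrenheit(c: float) -> float:
              return c * 9 / 5 + 32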

      Anyway, literature writers and software engineers aren't going to be replaced anytime soon.

      • visarga 3 days ago

        > Mechanical Turks will be the first to go.

        On the contrary, human annotation work is stepping up now because we create so many more prompts and want to test them.

      • Hard_Space 3 days ago

        Human studies participants have a safer job than firemen. The trouble is that it is already woefully underpaid, and unsettled industries will increase supply.

        There are crucial quality issues with Mechanical Turk, though, and when these really start damaging AI in obvious ways, the system (and the compensation, vetting procedures and oversight) seems likely to change.

      • torginus 3 days ago

        Those jobs are already outsourced if possible (remember RLHF for ChatGPT outsourced to Kenya, Facebook content moderation to India?) And if they aren't, that's usually for regulatory reasons.

      • razodactyl 3 days ago

        Yes, but don't dismiss the laws of scaling and our own inability to predict the future. Wait But Why did a great take on the AI leap.

        Imagine that current AI is already curating and generating datasets for the next generation.

        Also consider that what we have now is only possible because hardware capability increased.

        • satvikpendem 3 days ago

          Isn't scaling already a big problem in generative AI? Apparently there is not enough data to actually appreciably increase the quality of outputs, and lots of data is also being polluted by AI generations. This is why these companies are now shifting to more ancillary features like in this article (which Claude already has to some extent, as well as Copilot in VSCode and other editors).

          • NitpickLawyer 3 days ago

            > Apparently there is not enough data

            This was pretty much refuted by Meta with their Llama 3 release. Two key points I got from a podcast with the lead data person, right after release:

            a) Internet data is generally shit anyway. Previous generations of models are used to sift through, classify and clean up the data

            b) post-processing (aka finetuning) uses mostly synthetic datasets. Reward models based on human annotators from previous runs were already outperforming said human annotators, so they just went with it.

            This also invalidates a lot of the early "model collapse" findings when feeding the model's output to itself. It seems that many of the initial papers were either wrong, used toy models, or otherwise didn't use the proper techniques to avoid model collapse (or, perhaps they wanted to reach it...)

          • Bloedcoins 2 days ago

            We got a jump start with 'a ton of data' and are now discovering that less but better data is actually better.

            So yes, we will see manual labor to fine-tune the data layer, but this will only be necessary for a certain amount of time. And in parallel we also help by just using it, with the feedback we give these systems.

            A feedback loop mechanism is a fundamental part of AI ecosystems.

    • kovezd 3 days ago

      That thinking is driven by limited product imagination.

      The kind of apps that will be built in the next 5 years are nowhere near what we have today.

      Developers will need to update their skillset, though.

      • visarga 3 days ago

        It's not a fixed lump of work. Widening the road makes traffic increase, not congestion decrease. Same with coding: if we could do more, we would.

    • kristopolous 3 days ago

      It's the same thing that happened with mechanical looms. Programming will go from an artisan craft to the sweatshops and software programming will become low paying precarious gig work.

      This is all about labor and capital. When people toss hundreds of billions at something it almost always is.

      The social relationship doesn't have to be this way. Technological improvement could help us instead of screw us over. But we'd first have to admit that profit exploitation isn't absolutely the best thing ever and we'll never do that. Soooo here we are.

    • dboreham 3 days ago

      Sticking to my prediction that lawyers will be first.

      • ajmurmann 3 days ago

        That's an interesting example. Lawyers, unlike software developers, as a group go out of their way to ensure that they'll be needed and cannot be replaced by others or automation. They push for certain processes to require lawyers. We, on the other hand, are more eager to automate ourselves than anything else. Maybe that will boost our productivity and make us even more valuable and highly paid, or maybe we'll end up unemployed. Fascinating contrast between the professions.

      • choilive 3 days ago

        Lawyers have a tricky thing called a license to practice law. Software developers have no such protection.

        • forgotoldacc 3 days ago

          Programmers have been against unions/licenses in software development because we saw it as slowing down our ability to job hop, potentially massively lowering wages, and making a large barrier to entry (which hurts a lot of us who started as kids/teens).

          Now there's a chance that this unregulated wild west with a low barrier to entry that's benefited us for so long will come back to bite us in the ass. Kind of spooky to think about.

          • kristopolous 3 days ago

            I don't know if that's true. If I were in a WGA/DGA equivalent in my field that offered health care and scale pay, that would be great!

            I bet if you asked most programmers whether they'd like to have a professional guild similar to the writers who just went on strike, you'd probably be surprised, especially for gaming devs.

          • jdiez17 3 days ago

            I would be in favor of some kind of state approved exam/certification to ensure programmers have at least some basic knowledge of computer security and engineering ethics.

            > making a large barrier to entry (which hurts a lot of us who started as kids/teens)

            I doubt it. In my experience autodidacts are the best programmers I know.

            • hakanderyal 2 days ago

              The problem is they would most certainly require a CS degree to apply (if it's left to the bureaucrats), which would be the barrier.

        • kylecazar 3 days ago

          Robots are already successfully performing surgeries unassisted, despite a license being required in that field.

          I do think barred lawyers will have a role for quite a while, but it is plausible it shrinks to oversight.

          • gugagore 3 days ago

            What robot is performing what surgery unassisted?

            • filoleg 3 days ago

              Yeah, plus one on this one, extremely curious to hear as well.

              I am aware that remote robot surgeries have been a thing for quite a bit of time, but this is the first time ever I am hearing about unassisted robot surgeries being a thing at all.

              A follow-up question: if an unassisted robot surgery goes wrong, who is liable? I know we have a similar dilemma with self-driving cars, but I was under the impression that things are way more regulated and strict in the realm of healthcare.

        • lttlrck 3 days ago

          Defending yourself might become a whole lot easier.

        • bumby 3 days ago

          >Software developers have no such protection.

          I wonder if this is at all related to NCEES re-releasing their controls licensure option?

      • jonahx 3 days ago

        This fundamentally misunderstands what lawyers do. Your prediction might be right for paralegals and very junior attorneys. But the lawyers who make real money are not doing rote work. It's lunches, negotiation, politics, and for trial attorneys, performance and debate. Social activities, human skills. They'll always be around.

        • dboreham 2 days ago

          s/lawyers/software-developers/

      • frankdenbow 3 days ago

        Agree with that. The startup lawyer I used to use now charges $1,100/hour, which is untenable; I would much rather get the basics covered by an AI lawyer.

      • mekoka 3 days ago

        Live counsel in sensitive situations is definitely in the works, if not already in beta. Get pulled over by cops, or have the authorities asking to enter the premises, bring up your AI counsel and let them talk to the officer, before giving you advice on how to proceed. I can even envision an eventual future where public pressure results in law enforcement being paired with an AI assistant to help refresh their memory on some of the articles of the law.

      • jdiez17 3 days ago

        FWIW I used various LLMs to draft a freelance work contract with good results. Of course I carefully read, thought about every clause, edited, etc. It’s probably not as bulletproof as something a lawyer could produce, but it was definitely a big help.

      • jahewson 3 days ago

        Nah, the stakes are too high.

      • vlunkr 3 days ago

        There’s a company being fined for exactly this right now. That might be a deterrent.

    • torginus 3 days ago

      I think there's a lot of evidence out there that supports your theory.

      - Open-source hosting sites like GitHub provide the biggest, highest-quality training corpus out there, one that captures all aspects of dev work (code, changes, discussions about issues, etc.)

      - Synthetic data is easy to generate and verify: you can just run unit tests/the debugger in a loop until you get it right (see the toy loop after this list). Try doing that with contracts or tax statements.

      - Little to no regulatory protections.
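
      A toy sketch of that generate-and-verify loop (generate_candidate is a hypothetical stand-in for a model call):

          import subprocess
          import tempfile

          def passes_tests(candidate: str, tests: str) -> bool:
              # write candidate + tests to a temp file and run it; exit code 0 = verified
              with tempfile.NamedTemporaryFile('w', suffix='.py', delete=False) as f:
                  f.write(candidate + '\n\n' + tests)
                  path = f.name
              return subprocess.run(['python', path], timeout=30).returncode == 0

          def synthesize(prompt: str, tests: str, attempts: int = 10):
              for _ in range(attempts):
                  candidate = generate_candidate(prompt)    # hypothetical LLM call
                  if passes_tests(candidate, tests):
                      return candidate                      # verified synthetic sample
              return None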

    • probably_wrong 2 days ago

      > Because programming is an activity that requires the least amount of "grounding to reality" among all human activities.

      Maybe once you're deep into APIs that talk to other APIs, but near the surface where the data is collected there's nothing but "grounding to reality".

      As my professor of Software Engineering put it: when building a system for counting the number of people inside a room most people would put a turnstile and count the turns. Does this fulfill the requirement? No - people can turn the wheel multiple times, leave through a window, give birth inside the room, etc. Is it good enough? Only your client can say, and only after considering factors like "available technology" and "budget" that have nothing to do with software.

    • juthen 3 days ago

      Maintenance and long-term development will still require grounding to reality. A super-CEO might build the app themself, but keeping it running 5, 10, 20 years is a completely different deal. I imagine developers will eventually start to act more like librarians, knowing their system very well but not necessarily in charge of making new content.

    • raincole 3 days ago

      When AGI comes, who are the first to go is solely decided by regulation. Nothing else.

    • bamboozled 3 days ago

      No one has a job when AGI "takes over"? Weird take.

    • b212 3 days ago

      With AGI you won’t need most of the human race anymore; developers are just the tip of the iceberg.

      Luckily, ChatGPT and the rest have nothing to do with AI, not to mention AGI.

      • sitkack 3 days ago

        We didn't need humans in the first place. The collective "we" can decide that we want to do anything. People have this crazy fatalistic attitude about AI taking over, billionaires ejecting to Mars, and humans becoming irrelevant. Let me tell you, humans have been irrelevant since forever.

      • maxwell 3 days ago

        We won't need most of humanity wasting time gatekeeping and toiling; they can instead focus on study and leisure.

        • rayval 3 days ago

          That would be nice, to gambol in a lush, sunny meadow, perusing a leather-bound volume that we read in the original Greek or Sanskrit.

          Unfortunately, I fear we will instead end up sweaty and dirty and bloodied, grappling in the parched terrain, trying to bash members of a neighboring clan with rocks and wooden clubs, while a skyline of crumbling skyscrapers looms in the distance.

        • bbu 3 days ago

          Increased productivity has led to more, not less, toiling in the last 50 or so years. This is not gonna be different with AGI (if it ever happens). Some people will get very rich while the rest are still gonna work just as much as they do now. The jobs are just gonna suck even more.

      • Jeff_Brown 3 days ago

        It's not AGI yet but has everything to do with it.

        • lttlrck 3 days ago

          Hmmm. I'm going to withhold judgement until we get AGI. It might have about as much to do with it as the invention of the wheel and the lunar buggy.

          • maxwell 3 days ago

            How will you know when it's sufficiently general?

  • frankdenbow 3 days ago

    I envision lots of solo coders now able to compete with bigger companies, creating more niche software that meets people's needs better than generic solutions. Truly exciting time to be in software.

    • deergomoo 3 days ago

      I have the exact opposite concern. Software is/was one of the few industries where a solo person can already compete with big companies, or at least live comfortably alongside. Just look at the market for niche or boutique software that does stuff like calendars or emails or task management in a particular way.

      To me the current direction LLMs are headed seems like it will just further entrench the power of the trillion dollar megacorps because they’re the only people that can fund the creation and operation of this stuff.

    • jillesvangurp 3 days ago

      Yes, this is more or less my life. I run a small bootstrapped startup and do some consulting on the side. I have a few decades of experience, so it's not like I can't do things myself. But ChatGPT has enormously enhanced my output. It's rarely perfect, but I can usually bridge the gap by completing the job myself. My role is increasingly directing changes and telling it what needs doing next.

      Canvas sounds useful. I'll be playing with that as soon as I can access it.

      Another useful thing in ChatGPT that I've been leveraging is its memory function. I just tell it to remember instructions so I don't have to spell them out the next time I'm doing something.

    • mark_l_watson 2 days ago

      I agree, and want to add: AI will make it possible to inexpensively produce small tailored applications that only support locally required functionality. The advantage is very small code bases that are easier to understand. In other words, it makes it possible to avoid huge mega-apps of which any particular user might use only a small percentage of the functionality.

      Also, is it possible that smaller, focused apps will have fewer edge cases and be more reliable?

  • fnordpiglet 3 days ago

    I think that’s probably more because programmers are early adopters of new technologies, and the people building the technologies are programmers. There are lots of roles that are easier to automate completely with an LLM as they improve, but that are harder to make inroads with. I expect that as the initial waves of LLM startups mature and winnow, we will see some fields almost entirely automated. For instance, medical coding feels totally ripe, since it’s basically a natural-language classification exercise that’s easily fine-tuned.

  • EZ-E 2 days ago

    Am I the only one not seeing it? AI is a very useful assistant, boosts productivity, and makes coding easier, but ultimately, in real-life scenarios besides POCs, it cannot replace a human. You quickly reach a threshold where explaining and getting the AI to do what you want is actually harder than doing it yourself. What happens if your LLM-built app has a bug and the AI does not "get" it?

    • Bloedcoins 2 days ago

      You need to ask 'when'.

      ChatGPT shows a clear path forward: a feedback loop (consistent improvement), tooling which leverages all of LLMs' powers, writing unit tests automatically, and running code (ChatGPT can run Python already; when will it be able to run Java and other languages?).

      And it's already useful today for small things. Copilot is easier and more integrated than googling parameters or looking up documentation.

      UIs/IDEs like Cursor are a lot more integrated.

      What you see today is just the beginning of something potentially big.

      • EZ-E 2 days ago

        I respect your opinion and you could be right, but I don't buy it so far. While integrations have improved, we don't see major advances anymore in the LLM models everything relies on. Compare the jump from GPT-3.5 to 4 vs. the next iterations; they still suffer from the same limitations LLMs have (context length, overconfidence, hallucinations). Maybe I'm too impatient.

        • Bloedcoins 2 days ago

          From a research point of view, context length got a lot better in the last year and continues to improve.

          ChatGPT just released its new voice mode.

          It took over a year to get GitHub Copilot rolled out in my very big company.

          People work left and right to make it better. Every benchmark shows either smaller models or faster models or better models. This will not stop anytime soon.

          Flux for image generation came out of nowhere and is a lot better with faces, hands, and image descriptions than anything before it.

          Yes, the original jump was crazy, but we are running into capacity constraints left and right.

          Just how long it takes for a company to buy enough GPUs, build a platform and workflows, transition capacity into it, etc. is a limit in itself.

          When I say AI will change our industry, I don't know how long it will take. I guess 5-10 years, but it makes it a lot more obvious HOW, and the HOW was completely missing before GPT-3. I couldn't have come up with a good idea of how to do something like this at all.

          And for hallucinations, there are also plenty of people working left and right. The reasoning of o1 is the first big attempt by a major company to run a model for longer. But to run o1 for 10 seconds and longer, you need a lot more resources.

          Nvidia's chip production is currently a hard limit in our industry. Even getting enough energy into datacenters is a hard limit right now.

          It's clearly not money, if you look at how much money is thrown at it already.

  • devjab 3 days ago

    I think a lot of developers will get replaced by AI. I’ve worked in digitalisation and automation for a couple of decades now, and I’ve gone into a role which specialises in helping start-ups grow their IT into something that will actually work as they transition into enterprise organisations. I think almost all the work I replace or optimise will be done without developers in a few years (or maybe in a decade). This is everything from data transformation, storage, and transportation to applications and websites.

    In the organisation I currently work in, we’re already seeing rather large amounts of digitalisation done by non-developers. This is something organisations have tried to do for a long time, but all those no-code tools, robotic process automation and so on quickly require some sort of software developer despite all their lofty promises. This isn’t what I’m seeing with AI. We have a lot of people building things that automate or enhance their workflows; we’re seeing API usage and data warehouse work done by non-developers in ways that are “good enough” and often on the same level as or better than what software developers would deliver. They’ve already replaced their corporate designers with AI-generated icons and such, and they’ll certainly need fewer developers going forward, possibly relying solely on external specialists when something needs to scale or has too many issues.

    I also think that a lot of “standard” platforms are going to struggle. Why would you buy a generic website for your small business when you can rather easily develop one yourself? All in all, I’d bet that at least 70% of the developer jobs in my area aren’t going to be there in 10 years. So while these tools are generating new jobs, they’re not software development jobs.

    I’m not too worried for myself. I think I’m old enough that I can ride on my specialty in cleaning up messes, or if that fails transition into medical software or other areas where you really, really, don’t want the AI to write any code. I’d certainly worry if I was a young generalist developer, especially if a big chunk of my work relies on me using AI or search engines.

  • faitswulff 3 days ago

    I know a similar non-coding founder who was using LLMs to create a full fledged TypeScript based SaaS product and regularly comes to me with high level architecture questions, but also doesn't know or care to figure out what HTTP methods are, variable names are a mishmash of whatever case the LLM decided to generate that day, and there are no tests whatsoever. It's held together by sheer force of manual QA.

    ...but. It does work.

    • Kostchei 3 days ago

      This is my technique: test the output, make sure it works (in the sense of outputs) the way I want. Test the input edge cases, move on. Occasionally, when I can't get it to do what I want, the LLM suggests things like logging output between functions etc., in which case they get added; at the end I ask it to take out all the logging and make the code more concise.

      And sometimes it breaks in ways I can't fix, so rolling back or picking a new patch from a known break point becomes important.

      16 hours for my first Azure pipeline: auto-updates from code to prod, a static app, including setting up Git, VS Code, Node, Azure creds, etc. I chose a stack I have never seen at work (I mostly see AWS) and I am not a coder. My last code was Pascal in the 1980s.

      3rd app took 4 hours.

      Built things I have wanted for 30 years.

      But yes- no code understanding, brute force.

      • sitkack 3 days ago

        Which is awesome, and if you wanted to understand the code, it would do an amazing job of tutoring you. I love seeing people being able to solve their own problems without the need for a professional programmer.

      • kortilla 3 days ago

        This is how you end up with software full of vulnerabilities.

      • komali2 3 days ago

        The downside I've noticed is if I do this, I can't explain how I "solved the problem" during job interviews. I tried once, "I didn't, chatgpt solved it for me," and they laughed and I didn't get the job, so I stopped admitting that and said I just use chatgpt to quickly write up boilerplate for me.

        • Timber-6539 2 days ago

          How was admitting this supposed to help you in an interview? Anyway you won't learn anything if you don't review and go deeper into the code you've written with ChatGPT.

      • mark_l_watson 2 days ago

        re: “Built things I have wanted for 30 years.”

        +1 same for me.

        A few times a month I now build something I have wanted in the past but can now afford the time to build. I have always prided myself on being pretty good at working with other human developers, and now I feel pretty good at using LLM-based AI as a design and coding assistant, when it makes sense to not just do all the work myself.

      • jdiez17 3 days ago

        Thanks for sharing your story! Reminds me a bit of Prometheus, stealing fire from the gods and giving it to humanity.

    • Bloedcoins 2 days ago

      Yeah, until you realize that a lot of small-scale companies have done the same thing with avg/bad developers for ages and ... it does work.

      If you don't have a central security team like in big companies, or the need for an audit, what you don't know is what you don't care about.

      Obviously until it's too late, but holy shit, I have seen too much garbage just keep working for waaaay too long.

  • aussieguy1234 3 days ago

    As an engineer I've spoken to a couple of different designers who are building out prototypes of their startup ideas using LLM assistance with the coding.

    While no actual engineer is involved at that stage, if they get funded then I'm sure their next step will be to hire a real engineer to do it all properly.

    • Kostchei 3 days ago

      <insert manic laughter> I mean that might happen, but why get funding? Why not move to market immediately? Without debt. Get in the thick of it. Just do. Do you want a wage, or do you want a product that does the thing? Because sometimes, with the LLM, you can just build the thing. The marketing, the compliance, you might hire for that, or you might also outsource it to the LLM.

      Why would you hire? Either it works - in the sense that it does the job and is cost-effective - or it does not.

      Is there a situation where paying hundreds of k in wages suddenly makes a thing a good idea? I have doubts.

      • aussieguy1234 3 days ago

        Let's see if your little app can handle millions of daily users without an actual engineer. Your average application will fall over before that.

        It'll be some time before an AI will be able to handle this scenario.

        But by then, your job, my job and everyone else's job will be automated, it's entirely possible the current economic system will collapse in this scenario.

  • fervor 3 days ago

    I would like to see the app prompted into existence by the junior dev in question.

  • ijidak 3 days ago

    In sci-fi, they'll often show a single super developer create a virtual world that realistically no single human would have the time to create.

    Ready Player One comes to mind. Maybe Tron Legacy.

    But, with AI productivity, it looks like AI will allow such super developers to create monstrously large worlds.

    I can't wait to see this generation's Minecraft or the next Linus.

  • vlunkr 3 days ago

    It’s hardly the first time someone has shipped an app within a year of starting to code. It’s impressive, but not revolutionary.

    • brailsafe 3 days ago

      I also wonder what the person would have been hired for... maybe QA? I was doing this with random relevant scripts nearly 20 years ago but wasn't given a job where code would be relevant for the task until loooooong after I could comprehend what I was doing

    • epolanski 2 days ago

      No no, he shipped the app before learning to code.

      The fact that he liked and enjoyed coding was what actually prompted him into learning to code after that first experience.

      • vlunkr 2 days ago

        So he’s a naturally talented developer who learned to code as he created his first app. Maybe he didn’t understand specifics, but you have to be able to intuit a lot to string a bunch of AI snippets into an app.

  • m3kw9 3 days ago

    With how easy it is to knock out utility functions, there is gonna be a new baseline in functionality and speed of development for what people expect of software in general. It's gonna ramp up from here till we get to generative UIs.

  • dzhiurgis 3 days ago

    Regulating either developer credentials (i.e. are you certified for this type of solution) or outputs (formal validation of each release) comes to mind.

    It's gonna sound like gatekeeping, but letting people without any experience build impactful software is risky.

    • latentsea 3 days ago

      Even letting people with experience is risky.

GiorgioG 3 days ago

Not available in my favorite IDE? Not even going to bother.

qwertox 3 days ago

OpenAI is so far ahead of the competition. They're able to implement anything they like from competitors, and then some.

Claude really needs a sandbox to execute code.

If Anthropic were smart about it, they'd offer developers ("advanced users") containers which implement sandboxes, which we could pull to our local machines and which would then connect to Claude, so that it can execute code on the user's machine (inside the containers), freeing up resources and raising fewer security concerns on their side. It would be up to us whether to wrap it in a VM, and if we're comfortable with it, we could even let it fetch things from the internet. They should open source it, of course.
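
A minimal sketch of what the local side of that could look like, just shelling out to a throwaway container (assuming Docker; the flags shown are standard ones):

    import subprocess

    def run_in_sandbox(code: str, timeout: int = 30) -> str:
        # execute model-generated Python in a disposable, network-less container
        result = subprocess.run(
            ['docker', 'run', '--rm', '-i',
             '--network', 'none',         # no internet unless you opt in
             '--memory', '512m', '--cpus', '1',
             'python:3.12-slim', 'python', '-'],
            input=code, capture_output=True, text=True, timeout=timeout,
        )
        return result.stdout if result.returncode == 0 else result.stderr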

In the meantime, Google still dabbles in their odd closed system, where you can't even download the complete history as a JSON file. Maybe Takeout allows this, but I wouldn't know. They don't understand that this is different from their other services, where they (used to) gatekeep all the gathered data.

  • vineyardmike 3 days ago

    This is an odd comment, because you mention Claude and Google, both of which have already had similar/adjacent features for a while. OpenAI is actually defensive/behind.

    1. Claude has “artifacts” which are documents or interactive widgets that live next to a chat.

    2. Claude also has the ability to run code and animated stuff in Artifacts already. It runs in a browser sandbox locally too.

    3. Gemini/Google has a ton of similar features. For example, you can import/export Google Docs/Sheets/etc. in a Gemini chat. You can also open Gemini in a doc to have it manipulate the document.

    4. Also, you can use Takeout; weird of you to criticize a feature as missing, then postulate it exists exactly where you’d expect.

    If anything this is OpenAI being defensive because they realize that models are a feature not a product and chat isn’t everything. Google has the ability and the roadmap to stick Gemini into email clients, web searches, collaborative documents, IDEs, smartphone OS apis, browsers, smart home speakers, etc and Anthropic released “Artifacts” which has received a ton of praise for the awesome usability for this exact use case that OpenAI is targeting.

    • qwertox 3 days ago

      I mean the following:

      `use matplotlib to generate an image with 3 bars of values 3, 6, 1`

      followed by

      `execute it`

      https://chatgpt.com/share/66fefc66-13d8-800e-8428-815d9a07ae...

      (apparently the shared link does not show the executed content, which was an image)

      https://imgur.com/a/PkJCnKO
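
      (For reference, the snippet behind that prompt is trivial, roughly:

          import matplotlib
          matplotlib.use('Agg')             # headless backend, as a sandbox would use
          import matplotlib.pyplot as plt

          plt.bar(['a', 'b', 'c'], [3, 6, 1])
          plt.savefig('bars.png')

      The point isn't the code, it's that ChatGPT can run it and show the image inline.)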

      This execution ability has interesting consequences: I saw it self-execute code it had generated for me and fix the errors in that code by itself, twice, until it gave me a working solution.

      (Note that I am no longer a Plus user)

      ---

      Claude: I apologize, but I don't have the ability to execute code or generate images directly. I'm an AI language model designed to provide information and assist with code writing, but I can't run programs or create actual files on a computer.

      ---

      Gemini: Unfortunately, I cannot directly execute Python code within this text-based environment. However, I can guide you on how to execute it yourself.

      ---

      > 4. Also you can use takeout

      I just checked and wasn't able to export Gemini interactions via Takeout. There are some irrelevant things like "start timer 5 minutes" which I triggered with my phone, absolutely unrelated to my Gemini chats. takeout.google.com has no Gemini section.